Abstract

An algorithm for computing the components of an undirected graph in the (asynchronous)
APRAM model is given; the algorithm uses O(n + e) processes and O(log n) rounds.

1      Introduction 

This  paper  is  part  of  an  investigation  of  the  effect  of  process  asynchrony  on  parallel  algorithm 
design.  As  is  well  known,  the  main  effort  in  parallel  algorithm  design  has  employed  the 
PRAM  model.  This  model  hides  many  of  the  implementation  issues,  allowing  the  algorithm 
designer to focus first and foremost on the structure of the computational problem at hand;
synchronization is one of these hidden issues.

In turn, the work on synchronization is part of a broader research effort which has sought
to  take  into  account  some  of  the  implementation  issues  hidden  by  the  PRAM  model.  Broadly 
speaking,  two  major  approaches  have  been  followed.  One  body  of  research  is  concerned 
with asynchrony and the resulting non-uniform environment in which processes operate¹
[CZ89,CZ91,CZ90,Nis90,MPS89,MSP90].  The  other  body  of  research  has  considered  the 
effect  of  issues  such  as  latency  to  memory,  but  assumes  a  uniform  environment  for  the 
processes  [PU87,PY88,AC88,ACS89,Gib89]. 

*The work was supported in part by NSF grants CCR-8902221 and CCR-8906949, and by a John Simon
Guggenheim Memorial Foundation Fellowship.

¹We distinguish between processes and processors in order to emphasize that the APRAM is not a machine
model but rather a programming model; it is the task of a compiler to implement the programming model
on actual machines. The term processor will be used to refer to this component of a machine.

The PRAM is a synchronous model and thus it strips away problems of synchronization.
However, the implicit synchronization provided by the model hides the synchronization
costs from the user. In many cases, an algorithm may have to be redesigned in order to
allow  it  to  run  efficiently  in  an  asynchronous  environment.  In  this  paper,  we  are  concerned 
with the design of a connectivity algorithm which performs well in the presence of the
non-uniformity introduced by asynchrony. Another approach to the problem of asynchrony is to
seek to efficiently compile PRAM algorithms to operate in asynchronous environments; this
approach is followed by Martel, Park and Subramonian [MPS89,MSP90]. They give efficient
randomized simulations of arbitrary PRAM algorithms on an asynchronous model, given
certain architectural assumptions (e.g., the availability of a compare&swap instruction). It
is not clear whether similar deterministic compilers exist; in the absence of such compilers,
obtaining deterministic algorithms appears to require designing them directly for an
asynchronous environment. This is the focus of our work.

In a companion paper, [CZ91], the APRAM, a model for asynchronous parallel computation,
is defined (this model was first suggested by [KRS88a]). By contrast with the PRAM
model,  there  is  no  global  clock  governing  the  computation;  all  synchronization  required  by  a 
particular  application  has  to  be  provided  explicitly.  Consequently,  we  expect  the  complexity 
of  asynchronous  algorithms  to  be  at  least  as  large  as  that  of  their  synchronous  counterparts 
(for  now  the  costs  of  synchronization  have  to  be  explicitly  paid  for). 

[CZ91] discusses two algorithms, summation along an implicit binary tree and recursive
doubling. It shows that efficient APRAM algorithms can be obtained from known
PRAM algorithms by replacing global synchronization by local synchronization. The resulting
algorithms have a more flexible computation structure and, thus, perform better in an
asynchronous environment.

In  this  paper  we  consider  a  third  problem:  computing  the  connected  components  of  an 
undirected  graph.  Our  algorithm  is  substantially  different  from  all  known  PRAM  algorithms. 
The  key  to  designing  the  algorithm  was  to  identify  a  structure  for  the  computation  that 
would  be  flexible  enough  to  allow  the  computation  to  proceed  efficiently,  while  at  the  same 
time  be  rigid  enough  to  guarantee  that  the  algorithm  terminates  correctly.  The  result  is  an 
apparently  chaotic  algorithm,  whose  description  is  simple,  but  whose  analysis  is  not. 

The presence of asynchrony calls for new complexity measures. Before discussing these,
we remark that there are two types of synchronization costs: explicit and implicit. By explicit
costs we mean the overhead for achieving synchronization. This could be, for instance, the
cost of executing extra code that must be added to the algorithm in order to synchronize.
When processes proceed at different speeds, if the algorithm is required to proceed in lock
step, the time required to execute a step is dictated by the slowest process. By implicit
costs we refer to the cost associated with lock step execution apart from the explicit costs
of synchronization. We do not define the notion of explicit and implicit costs formally, for
we do not think we have enough experience to justify definitive definitions.

The basic measure used in [CZ91] is the rounds complexity measure of distributed computing;
this allows the explicit costs of synchronization to be measured. [CZ90] introduces
two  other  measures,  the  bounded  delays  measure  and  the  unbounded  delays  measure,  which 
allow  the  implicit  costs  of  synchronization  to  be  measured.  This  paper  will  only  be  concerned 
with  the  rounds  complexity  measure. 

There are a considerable number of PRAM algorithms for computing the connected
components of an undirected graph [HW90,HCS79,Wyl79,SV82,Vis84,Gaz86,CV87]. They
all adopt the following basic strategy: select a subset of the edges forming a forest and then
compress each tree to a star (a tree of depth 1); iterate. Shiloach and Vishkin gave the first
O(log n) time algorithm by interleaving these steps; it uses n + e processes under the CRCW
PRAM (n is the number of vertices and e is the number of edges). Cole and Vishkin gave
an algorithm with time complexity T(n) = O(log n) using (n + e)α(e, n)/T(n) processes,
where α(e, n) is the inverse Ackermann function. Their algorithm is optimal for e > n log* n;
however, the constants hidden behind the big-O notation are quite large due to the use of
expander graphs.

In  this  paper  we  present  an  asynchronous  APRAM  algorithm  for  graph  connectivity; 
an  algorithm  which  is  significantly  different  from  all  known  PRAM  algorithms.  At  a  very 
high  level,  our  approach  is  similar  to  the  approach  of  Shiloach  and  Vishkin;  however,  their 
algorithm  depends  substantially  on  the  implicit  synchronization  of  the  PRAM  model  and 
we  have  not  been  able  to  obtain  an  APRAM  algorithm  simply  by  slightly  modifying  their 
algorithm. 

We relaxed some of the constraints (on building forests) present in the known PRAM
algorithms; in fact, we may build structures having a single cycle. In order to cope with
the cycles in the compression phase, we need to introduce further pointers. This results
in a substantially different algorithm. It also has the effect of reducing the required
synchronization, which makes the algorithm's progress appear rather chaotic; nonetheless, we
can prove that the algorithm terminates correctly in O(log n) rounds (rounds are defined in
Section 2). (The correctness of the algorithm relies on the ability to read and write records
of 4 fields atomically.) The asynchronicity of the algorithm makes the proof of correctness
and the complexity analysis substantially more challenging than would be the case for a
PRAM algorithm.

Before describing our algorithm we briefly review the APRAM model and the rounds
complexity measure.


2     The  APRAM  Model 

We  provide  here  a  short  description  of  the  APRAM  model,  which,  though  incomplete, 
suffices  for  understanding  the  algorithm  presented  in  this  paper.  For  the  full  details  the 
reader  is  referred  to  [CZ91].  The  APRAM  is  an  asynchronous  shared  memory  parallel 
computation  model.  It  comprises  a  collection  of  processes,  each  executing  operations  called 
events.  An  APRAM  computation  is  a  serialization  of  all  the  events  of  all  the  processes; 
there  may  be  more  than  one  such  serialization. 

In order to assess the complexity of asynchronous algorithms, the APRAM model provides
a complexity measure called the rounds complexity; the global clock used by the PRAM
model is replaced by a virtual clock. This approach was introduced in [PF77] and used in
[AFL83,LF81,KRS88b] and is common in the area of distributed computing (see [Awe87],
[Awe85,AG87]). Consider a computation, C. A virtual clock of C is an assignment of unique
virtual times to the events of C; the times assigned are a non-decreasing function of the
event number.

The virtual clock is meant to correspond to the "real" time at which the operations occurred
in one possible execution of the algorithm, called a computation. The time difference
between  two  consecutive  events  of  a  process  is  called  the  duration  of  the  later  event.  The 
length  of  a  computation  is  the  time  assigned  to  the  last  event  in  the  computation. 

Under  the  rounds  complexity,  a  virtual  clock  is  valid  if  the  duration  of  each  event  is 
at  most  one.  In  effect,  a  computation  is  divided  into  contiguous  segments,  called  rounds, 
where  each  segment  contains  at  least  one  event  from  each  process.  The  complexity  of  a 
computation is the number of rounds in the computation maximized over all possible
subdivisions. The rounds complexity of an algorithm on a given input is the rounds complexity
of  the  computation  with  largest  complexity.  The  rounds  complexity  of  an  algorithm  is  then 
the  rounds  complexity  on  a  given  input  maximized  over  all  inputs  of  a  given  size.  [CZ90] 
uses  probabilistic  variants  of  the  virtual  clock  in  order  to  obtain  complexity  measures  which 
measure  the  implicit  costs  of  synchronization. 
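
To make the definition concrete, the following Python sketch counts the rounds of one serialization. It assumes, purely for illustration, that a computation is given as a list of process identifiers (one per event) and that trailing events which do not complete a round are not counted; the names count_rounds, events and processes are ours and do not appear in [CZ91]. Closing a round as soon as every process has contributed an event yields the largest number of complete rounds.

```python
from typing import Hashable, Iterable, Sequence


def count_rounds(events: Sequence[Hashable], processes: Iterable[Hashable]) -> int:
    """Return the maximum number of complete rounds in a serialization.

    events[i] is the id of the process that executed the i-th event.
    A round is a contiguous segment containing at least one event from
    every process. Trailing events that do not complete a round are
    ignored (an assumption of this sketch).
    """
    all_procs = set(processes)
    seen = set()
    rounds = 0
    for proc in events:
        seen.add(proc)
        # Close the current round as early as possible.
        if seen >= all_procs:
            rounds += 1
            seen = set()
    return rounds
```

For example, with two processes a and b, the serialization a, b, a, a, b, b splits into the rounds (a, b) and (a, a, b), with one event left over.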

It  is  often  easier  to  analyze  algorithms  in  terms  of  higher  level  constructs,  each  built  of 
a  constant  number  of  low  level  building  blocks.  [CZ91]  shows  that  the  resulting  complexity 
will  be  off  by  only  a  constant  multiplicative  factor,  which  we  ignore,  as  is  often  done  in 
asymptotic analysis. This allows one to ignore low level details of the algorithm. In the
discussion  of  the  connectivity  algorithm  we  refer  to  operations  rather  than  events,  where 
each  operation  comprises  a  constant  number  of  events. 


3     The  Algorithm 

The input is a graph G = (V, E). Let n = |V|, e = |E|. The goal of the algorithm is to
find a mapping, f : V → V, such that for each pair of vertices u, v, f(u) = f(v) if and only
if u and v are connected in G. We describe a connectivity algorithm which uses O(n + e)
processes and O(log n) rounds.

The algorithm computes a series of refinements. It manipulates two mappings, A and
R, on the vertices of G. We call A(v) the vertex ahead of v and R(v) the reference of v. The
algorithm starts by initializing A and R to the identity mapping on V; A and R are then
modified while maintaining that at any point in time, for any vertex v, A(v), R(v) and v
are connected in G. We show that upon termination we have the desired property: u and
v are connected in G if and only if A(u) = A(v). As an aid to the analysis the algorithm
uses an additional mapping, p : V → V, the parent mapping; however, the algorithm does
not explicitly manipulate p; in fact, it is unaware of p.

A few definitions are helpful. A pointer graph is a directed graph in which each vertex
has out degree one; an edge may be a self-loop. A pointer graph is a tree graph if each
cycle is a self-loop. A pointer graph is weighted if there is a weight associated with each
vertex. We will use v to denote both the vertex and its weight, where no ambiguity results.
By definition, each component of a pointer graph has exactly one cycle, possibly trivial (a
self-loop). In a weighted pointer graph with unique weights, the largest element in a cycle is
called a leader; each component has exactly one leader. If the component has a trivial cycle,
the vertex in the trivial cycle is called a root and the component is said to be rooted. For any
mapping, f, from V to V, define E_f, the edge set induced by f, by E_f = { (v, f(v)) | v ∈ V };
likewise, define G_f, the pointer graph induced by f, by G_f = (V, E_f). An f-property refers
to the corresponding property with respect to graph G_f. For instance, when we say u is an
f-ancestor of v we mean u is an ancestor of v in G_f (i.e., u is reachable from v in G_f).
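
The definitions above can be illustrated by a short Python sketch that finds the leader of every component of a pointer graph. It assumes that vertices are integers that double as their own (unique) weights and that the mapping f is given as a dictionary; the function name leaders is ours and is used only for this illustration.

```python
from typing import Dict


def leaders(f: Dict[int, int]) -> Dict[int, int]:
    """Map each vertex to the leader of its component in the pointer graph G_f.

    Every vertex has out degree one, so following f from any vertex
    eventually enters the unique cycle of its component; the leader is
    the largest vertex on that cycle.
    """
    leader_of: Dict[int, int] = {}
    for start in f:
        path, pos = [], {}
        v = start
        # Walk until we revisit a vertex of this walk or reach a solved one.
        while v not in pos and v not in leader_of:
            pos[v] = len(path)
            path.append(v)
            v = f[v]
        if v in leader_of:
            lead = leader_of[v]
        else:
            lead = max(path[pos[v]:])  # largest vertex on the cycle
        for u in path:
            leader_of[u] = lead
    return leader_of
```

A component is rooted exactly when its leader l satisfies f(l) = l.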

Our  point  of  departure  is  the  PRAM  algorithm  of  Shiloach  and  Vishkin  [SV82].  Our 
algorithm is substantially different from the Shiloach/Vishkin algorithm. However, in order to
understand  our  approach  it  is  helpful  to  review  the  Shiloach/Vishkin  algorithm  and  explain 
why  it  is  difficult  to  implement  it  directly  on  the  APRAM  model. 

The Shiloach/Vishkin algorithm manipulates a mapping, A, and constructs a tree graph
G_A on the vertex set of the input graph. Initially, A(v) = v for each vertex v. On termination,
A(u) = A(v) if and only if u and v are in the same component of the input graph. A
vertex, v, is called a root if A(v) = v.

Their  algorithm  follows. 


Iterate log_{3/2} n times

    For each vertex v in parallel do:

1       A(v) := A(A(v))  (* doubling *)

    Assume each (undirected) edge is represented as two directed edges.
    For each directed edge (u, v) in parallel do:

2       if A(u) is a root and A(u) < A(v) then
            A(A(u)) := A(v)  (* hooking *)

3       if A(u) is a root and A(u) did not get new children in Steps 1 or 2 then
            A(A(u)) := A(v)  (* stagnant hooking *)

    For each vertex v in parallel do:

4       A(v) := A(A(v))  (* doubling *)


To show that the algorithm terminates in O(log n) steps, each component of the input
graph is assigned a potential with respect to the A-graph: it is equal to the number of
A-components plus the sum of the edge heights of all the A-components (the edge height of
a one vertex component is defined to be one). The potential of a component of the input
graph is reduced to two when, for each pair, u, v, of vertices in this component, A(u) = A(v).
Prior to this point, on each iteration of the above algorithm, the potential of the component
is reduced by at least one third. For, in each doubling, A-components of height greater than
one have their height reduced by at least one third, while A-components of height one are
hooked to each other in Steps 2 and 3.
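
The effect of a single doubling step on the height of a tree can be checked with a small Python sketch. It models only the synchronous doubling of Steps 1 and 4 (no hooking), represents A as a list indexed by vertex, and uses helper names (height, double) that are ours.

```python
from typing import List


def height(A: List[int], v: int) -> int:
    """Number of A-edges on the path from v to its root (A[r] == r)."""
    h = 0
    while A[v] != v:
        v = A[v]
        h += 1
    return h


def double(A: List[int]) -> List[int]:
    """One synchronous doubling step: all reads use the old A, then all vertices write."""
    return [A[A[v]] for v in range(len(A))]


def tree_height(A: List[int]) -> int:
    """Largest height of any vertex in the tree graph."""
    return max(height(A, v) for v in range(len(A)))
```

On the path 3 → 2 → 1 → 0 (A = [0, 0, 1, 2]), the height drops from 3 to 2 after one doubling and to 1 after a second, so a height of 3 is the case in which the reduction is exactly one third.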

Unfortunately, it is not clear how to implement this algorithm directly in the APRAM
model without synchronizing all the processes after each step. The progress obtained in the
doubling step is guaranteed only as long as the graph remains acyclic; to ensure this, the
hooking of Step 2 must be a controlled hooking. The stagnant hooking, performed in Step
3, is key to achieving O(log n) running time; however, the stagnant hooking can be executed
by the process associated with (u, v) only after it synchronizes with all the processes that
are creating new links to A(u). This can be done in constant time in a synchronous model
but it is not clear how to achieve this asynchronously in o(log n) rounds.

The description of our algorithm uses an auxiliary mapping p. Informally, p records all
the links created by the algorithm and is unaffected by the doubling step. To facilitate the
transition to our algorithm we define the p pointer with respect to the Shiloach/Vishkin
algorithm although it is not needed there. p is initialized to the identity mapping on V.
Whenever a root, r, is hooked to a vertex, v, (in either Step 2 or Step 3) p(r) is set to v. p
is not modified during the doubling steps (Steps 1 and 4). The same definition for p is used
in the informal description of our algorithm, below. The graph G_p is called the underlying
graph.

Our key departure from the Shiloach and Vishkin algorithm is that we allow non-trivial
cycles in the underlying graph. So we are forced to modify the doubling procedure, which
we do as follows. Given a non-trivial cycle, our goal is to break the cycle at the leader
and then make the leader the root of the component. The R pointers help carry out this
computation. The following explanation, while not completely correct, provides the intuition
behind the algorithm. Suppose the cycle is defined by p pointers, the initial values of the A
pointers. Also suppose the doubling is performed by the A pointers, using the assignment
A(v) := A(A(v)). (Note incidentally, that for an even length cycle if these assignments
are performed simultaneously the cycle partitions into two new cycles, each having half
the length of the original cycle; the new cycles need not be connected in the A-graph.
Clearly we need to retain this connectivity information; this is achieved, implicitly, by the
R pointers.) R(v) is defined to be the largest vertex on the path from p(v) to A(v). Thus,
initially R(v) = p(v). When A(v) is updated by doubling, R(v) is simultaneously assigned
max{R(v), R(A(v))}. Clearly, if v = R(v), v is the largest vertex on the cycle (the leader of
the cycle).
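
The intuition above can be traced on a cycle of length four with a short Python sketch. Like the informal explanation, it is deliberately simplified: the doubling is performed synchronously, promotion and tags are ignored, and vertex numbers serve as weights. The function name double_with_reference is ours.

```python
from typing import List, Tuple


def double_with_reference(A: List[int], R: List[int]) -> Tuple[List[int], List[int]]:
    """One synchronous doubling step that also maintains the R pointers.

    R(v) := max{R(v), R(A(v))} and A(v) := A(A(v)); all reads use the
    old values.
    """
    n = len(A)
    new_R = [max(R[v], R[A[v]]) for v in range(n)]
    new_A = [A[A[v]] for v in range(n)]
    return new_A, new_R


def trace_cycle() -> List[Tuple[List[int], List[int]]]:
    """Run two doubling steps on the p-cycle 0 -> 1 -> 2 -> 3 -> 0."""
    p = [1, 2, 3, 0]
    A, R = list(p), list(p)  # initially A(v) = R(v) = p(v)
    history = [(A, R)]
    for _ in range(2):
        A, R = double_with_reference(A, R)
        history.append((A, R))
    return history
```

After the first step the A-graph consists of the two cycles {0, 2} and {1, 3}, which are no longer connected by A pointers. After the second step every vertex has R(v) = 3, so v = R(v) holds only for the leader, vertex 3.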

We say that a vertex, v, can see a p-leader, l, if l is a p-leader and R(v) = l. The doubling
procedure (and the rest of the algorithm) maintains the invariant that for any vertex, v, v,
R(v) and A(v) are in the same p-component and there is an A-path from v to a vertex that
can see the p-leader of that component. In this sense the doubling procedure does not break
components.

The hooking step operates on roots and links them to other vertices. In order to make
progress, components with non-trivial cycles eliminate these cycles by promoting the p-leader
to a root. This is done in the following manner. When doubling proceeds in a cycle with
p-leader l, eventually R(l) becomes equal to l; also, R(l) = l only if l is a leader. Therefore,
when R(l) = l, the process associated with l promotes l to a root by setting A(l) = l; p(l)
becomes equal to l as well. At that point the component is rooted and the root is ready to
make new links. At any time, at most one vertex in any p-cycle can be promoted and thus
the promotion operation does not disconnect p-components.

Note that when a vertex is promoted, some p-paths might be broken. As a consequence,
for a vertex, v, A(v) is not guaranteed to be a p-ancestor of v. The heart of the correctness
proof of the algorithm is proving that, nonetheless, v is promoted only if it is the largest
element in a p-cycle.

Consider a vertex, x, in a cycle with leader l. As soon as l is promoted, l is the most
"advanced" vertex in the component; in order to make progress, x must (eventually) recognize
that l was promoted and then set A(x) to l; otherwise the pointer A(x) simply chases
around the (old) cycle. Note that if R(x) = l, R(x) remains equal to l as long as A(x) is on
the cycle. However, due to the asynchronous nature of the algorithm, testing whether R(x)
is a root does not suffice: l can be promoted and create a new link before x tests whether
l is a root. Therefore, we tag each vertex in a manner described later. When a vertex is
promoted its tag is modified; using the tags a vertex can easily tell whether another vertex
has been promoted since the last time it checked. Thus, if x realizes that R(x) had been
promoted, even if R(x) is no longer a root, x treats R(x) as if it were A(x). To facilitate the
exposition a function F is introduced: for each v, F(v) = R(v) if R(v) had been promoted
and is equal to A(v) otherwise. The actual doubling procedure then becomes:

R(v) := max{R(v), R(F(v))}
A(v) := F(F(v))

In Section 4 we describe the graph connectivity algorithm and in Section 5 we prove that
the algorithm is correct. That is, if the algorithm terminates, upon termination any two
vertices, u and v, are in the same component of the input graph if and only if A(u) = A(v).
In Section 6 we specify the termination conditions for both the edge processes and the
vertex processes (these are details of the algorithm that are not given in Section 4, below).
We conclude, in Section 8, by proving an O(log n) bound on the rounds complexity of the
algorithm.

4      Pseudo  Code 

In addition to A(v) and R(v), a vertex, v, is assigned two variables, next(v) and new(v),
whose purpose will become clear below. For each undirected edge, (u, v), in the input
graph there are two associated directed edges, (u, v) and (v, u); each directed edge has an
associated process. For each directed edge, e = (u, v), ē points to the directed edge (v, u)
and V(e) points to a vertex called the current endpoint of e.

Auxiliary Functions:

•  Function IsRoot(v) for vertex v:

Return true if v is a root, false otherwise; v is a root iff R(v) = v.

•  Function Promote(v) for vertex v:

Change the tag of v.

•  Function IsPromoted(v, R(v)), R(v) a pointer to a vertex w:

Return true if the vertex w has been promoted with respect to vertex v.

•  Function Forward of v, F(v), v is a vertex:

If IsPromoted(v, R(v)), return R(v), otherwise return A(v).

Initializations: 

For each vertex v:

A(v) := v
R(v) := v
next(v) := v
new(v) := null

For each directed edge e = (u, v):

V(e) := u
ē := (v, u)

Edge  Procedure: 

The procedure for edge e:

1  V(e) := F(V(e))

2  if V(e) ≠ V(ē) then new(V(e)) := ē

3  if IsRoot(V(e)) then V(e) := F(next(V(e)))

Remark: In the Shiloach/Vishkin algorithm, the edge processes write new A values to the
appropriate vertices. Due to the asynchronous nature of our algorithm, the edge processes
do not modify the A values but simply write suggestions into a shared variable called new.
Thus, an edge, e, connecting u = V(e) to v = V(ē) would suggest to u to link to v by writing
ē into new(u); all the edge processes with endpoint u write to new(u) independently. If u is
a root, the vertex process associated with u reads new(u) and uses the value there to create
a new link. For reasons that have to do with the analysis, an edge advances its endpoints
(Step 1), writes an edge-id into new and not a vertex-id (Step 2) and migrates from roots
(Step 3). (The procedure RootProc executed at r defines the update to next(r), also called
the current destination of r.)
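
As a rough illustration of the three steps of the edge procedure, the following Python sketch performs one pass for a single directed edge. It is only one reading of the procedure and rests on several assumptions of ours: the reverse-edge pointer is stored in a dictionary rev, the edge writes the reverse edge into new at its own current endpoint, F and IsRoot are supplied by the caller as plain functions, and the pass runs sequentially with no concurrency or atomicity modelled. The name edge_step and all parameter names are hypothetical.

```python
from typing import Callable, Dict, Hashable, Optional


def edge_step(
    e: Hashable,
    V: Dict[Hashable, int],
    rev: Dict[Hashable, Hashable],
    new: Dict[int, Optional[Hashable]],
    nxt: Dict[int, int],
    F: Callable[[int], int],
    is_root: Callable[[int], bool],
) -> None:
    """One sequential pass of the edge procedure for directed edge e."""
    # Step 1: advance the current endpoint of e.
    V[e] = F(V[e])
    # Step 2: if the two current endpoints differ, leave a suggestion
    # (an edge id, not a vertex id) at the endpoint of e.
    if V[e] != V[rev[e]]:
        new[V[e]] = rev[e]
    # Step 3: migrate away from a root towards its current destination.
    if is_root(V[e]):
        V[e] = F(nxt[V[e]])
```

Under this reading, a root u that later reads new(u) obtains an edge whose current endpoint lies on the other side of the original edge.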

Vertex  Procedures:  There  are  two  procedures  for  the  vertices:  A  procedure  for  roots 
and  a  procedure  for  nonroots. 

The  procedure  for  vertex  v 

if R(v) = v then Execute procedure RootProc(v)
else  Execute  procedure  NonRootProc(v) 
end  if 


Each  root,  r,  performs  the  following  procedure: 

Procedure RootProc(r):

1  next(r) := F(next(r))

2  if next(r) = r then next(r) := F(V(new(r)))
   if IsRoot(next(r)) and next(r) ≠ r then
       R(r) := A(r) := next(r)
       next(r) := r
   end if

Remark. It seems counterproductive for a vertex to create a new link to one of its
descendants. For this reason, roots are restricted to create new links only to other roots. When
a vertex, r, becomes a root, it reads a suggestion from the variable new(r), a pointer to
an edge e. It then commits to connect to the component containing the vertex, V(e), by
writing F(V(e)) into next(r), the current destination of r (recall V(e) is the current
endpoint of edge e). r then repeatedly advances its current destination by replacing next(r)
with F(next(r)) until next(r) is a root. If the root, s = next(r), is not r, r proceeds to
create a link to s, otherwise, it reads a new suggestion from new(r) and repeats the process.
Note that between the time r checked whether next(r) is a root, s, and the time the new
link is created, s might have created a link and no longer be a root. For the purpose of the
analysis, it was easier to advance the current destinations of roots as described in the code
(Steps 1 and 2), rather than, for instance, waiting until F(next(r)) is a root.
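
The root procedure can be sketched in the same sequential style. This is an illustration under stated assumptions, not the definitive procedure: F and IsRoot are passed in as functions, the variables are held in dictionaries, a null suggestion in new(r) is simply skipped (the text does not say how that case is handled), and the Rctr update that accompanies a hooking is omitted. The name root_proc is hypothetical.

```python
from typing import Callable, Dict, Hashable, Optional


def root_proc(
    r: int,
    A: Dict[int, int],
    R: Dict[int, int],
    nxt: Dict[int, int],
    new: Dict[int, Optional[Hashable]],
    V: Dict[Hashable, int],
    F: Callable[[int], int],
    is_root: Callable[[int], bool],
) -> None:
    """One sequential pass of RootProc for root r."""
    # Step 1: advance the current destination of r.
    nxt[r] = F(nxt[r])
    # Step 2: with no destination other than r itself, read a suggestion.
    if nxt[r] == r and new[r] is not None:  # null check is our assumption
        nxt[r] = F(V[new[r]])
    # Hook r to another root, then reset the destination.
    if is_root(nxt[r]) and nxt[r] != r:
        R[r] = A[r] = nxt[r]
        nxt[r] = r
```

After a successful hooking R(r) ≠ r, so on its next activation the vertex executes the nonroot procedure.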

Each  nonroot  performs  the  following  procedure.     (The  max  function  applied  to  two 
pointers  to  vertices  returns  the  pointer  to  the  larger  vertex.) 

Procedure NonRootProc(v):

(* Advance the ahead and reference pointers *)
R(v) := max{R(v), R(F(v))}
A(v) := F(F(v))
(* Check if v became a leader at this step and if so promote it *)
if R(v) = v then
    A(v) := new(v) := next(v) := v
    Promote(v)
end if


4.1      Implementation  Details 

The Promotion operation and the computation of F are implemented using counters, as
follows. Each vertex v maintains a counter, ctr(v), initially zero. In addition, v stores a
second counter called Rctr(v), also initially zero. For each vertex, v, we need to distinguish
the time intervals delimited by its promotions. This is the role of ctr(v): each time v is
promoted, ctr(v) is incremented.

The algorithm makes the following atomicity assumption: when a vertex is accessed its
record is retrieved atomically; that is, the four values A(v), R(v), ctr(v) and Rctr(v) are
read simultaneously.

For each R pointer, R(v), we store a corresponding counter, Rctr(v); it records the time
interval of the vertex w = R(v) being indicated by this pointer. Thus we update Rctr as
follows. Promote(v) is implemented by incrementing ctr(v) and simultaneously assigning
Rctr(v) the new value of ctr(v). When R(v) is updated in a doubling operation, Rctr(v) is
simultaneously updated: it is given the Rctr value associated with the new value of R(v)
(note that two Rctr values are at hand; one for each possible update to R(v)). Finally, when
a hooking is performed by root r, Rctr(r) is updated to be equal to ctr(R(r)).

Now, it is straightforward to implement F. By definition, vertex R(v) has been promoted
with respect to v if and only if Rctr(v) ≠ ctr(R(v)). So F(v) is assigned A(v) if Rctr(v) =
ctr(R(v)) and is assigned R(v) otherwise.
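
The counter scheme can be written down compactly. The following Python sketch is illustrative only: it keeps the four fields of a vertex record in a dataclass (the names Record, is_promoted, forward and promote are ours), runs sequentially, and makes no attempt to model the atomic record access that the algorithm assumes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Record:
    A: int          # ahead pointer A(v)
    R: int          # reference pointer R(v)
    ctr: int = 0    # ctr(v): incremented each time v is promoted
    Rctr: int = 0   # Rctr(v): the ctr value of R(v) recorded with the pointer


def is_promoted(rec: Dict[int, Record], v: int) -> bool:
    """True iff R(v) has been promoted with respect to v."""
    r = rec[v]
    return r.Rctr != rec[r.R].ctr


def forward(rec: Dict[int, Record], v: int) -> int:
    """F(v): R(v) if it has been promoted with respect to v, else A(v)."""
    r = rec[v]
    return r.R if is_promoted(rec, v) else r.A


def promote(rec: Dict[int, Record], v: int) -> None:
    """Counter updates of a promotion: increment ctr(v), set Rctr(v) to it."""
    r = rec[v]
    r.ctr += 1
    r.Rctr = r.ctr
```

In a state where vertices 0 and 1 both have R = 2, forward returns their A values until promote is applied to vertex 2; afterwards it returns 2 for both.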

We show later, in Section 7.1, that each variable ctr(v) is incremented at most n times;
so the counters are readily stored and accessed.

Finally, we describe the implementation of the doubling procedure, the procedure for
nonroot vertices, v. v starts by accessing R(v)'s record at step t_R. Let ARv and RRv
denote the values of A(R(v)) and R(R(v)), respectively, at step t_R.

If R(v) had been promoted with respect to v at step t_R, v accesses RRv's record, at step
t'_R, to determine if RRv has been promoted; if not, it sets FRv to ARv and otherwise it sets
FRv to RRv. Then v computes newR(v) := max{R(v), RRv} and assigns newA(v) := FRv.

If, instead, R(v) had not been promoted with respect to v at step t_R, v accesses A(v)'s
record, at step t_A. Let AAv and RAv denote the values of A(A(v)) and R(A(v)), respectively,
at step t_A. Next, v accesses RAv's record, at step t'_A, to determine if RAv has been
promoted; if not, it sets FAv to AAv and otherwise it sets FAv to RAv. Now, v computes
newR(v) := max{R(v), RAv} and assigns newA(v) := FAv.

If v = newR(v), v modifies newA(v) to be v and sets newctr(v) := ctr(v) + 1 (otherwise,
newctr(v) := ctr(v)). v concludes with the following simultaneous assignment to its record:

R(v), A(v), ctr(v) := newR(v), newA(v), newctr(v)

(There is also an associated update to Rctr(v) which we have not detailed.)
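
The doubling operation of a nonroot, including one plausible treatment of the Rctr update, can be sketched as follows. The sketch is sequential and illustrative: the final replacement of the record stands in for the simultaneous assignment, the Rctr value is taken from whichever record supplied the new R(v) (our reading of the update described earlier in this section), ties keep the old value, and the updates to new(v) and next(v) on promotion are omitted. Record, forward and non_root_step are names of ours.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Record:
    A: int
    R: int
    ctr: int = 0
    Rctr: int = 0


def forward(rec: Dict[int, Record], v: int) -> int:
    """F(v): R(v) if R(v) was promoted with respect to v, else A(v)."""
    r = rec[v]
    return r.R if r.Rctr != rec[r.R].ctr else r.A


def non_root_step(rec: Dict[int, Record], v: int) -> None:
    """One doubling operation by nonroot v, written back as a single record."""
    old = rec[v]
    f = forward(rec, v)        # F(v)
    new_A = forward(rec, f)    # F(F(v))
    # new R(v) = max{R(v), R(F(v))}; carry the matching Rctr along.
    if rec[f].R > old.R:
        new_R, new_Rctr = rec[f].R, rec[f].Rctr
    else:
        new_R, new_Rctr = old.R, old.Rctr
    new_ctr = old.ctr
    if new_R == v:             # v has become a leader: promote it
        new_A = v
        new_ctr = old.ctr + 1
        new_Rctr = new_ctr
    rec[v] = Record(new_A, new_R, new_ctr, new_Rctr)
```

Starting from a hand-built p-cycle 0 → 1 → 2 → 0 with A = R = p and all counters zero, the activation order 2, 1, 2 promotes the leader, vertex 2; a subsequent activation of vertex 0 then sets A(0) to 2.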

The p Pointers. The p pointers are updated as follows.

1.  Initially, for each vertex, v, p(v) := v.

2.  When a root, r, is hooked to a vertex, u, p(r) := u.

3.  When a vertex, l, is promoted, p(l) := l.

4.  The p pointers are not modified by the doubling operation.

5     Correctness 

The  key  to  the  correctness  of  the  algorithm  is  that  for  any  vertex  v,  R{v)  =  v  only  if  v  is 
the  largest  element  in  a  cycle.  We  start  by  proving  this  property. 

Consider a computation of the algorithm and let v be any vertex. Recall that a
computation comprises a sequence of steps. When referring to the graph at step i we mean the
graph after the event of step i was completed and before step i + 1 has begun. Let p_t(v),
A_t(v), R_t(v), F_t(v), and V_t(v) denote the corresponding values at step t and let G_t denote
the graph Gp at step t (i.e., G_t is shorthand for G_{p_t}).

Define the edge set E^t to be the set of edges induced by the collection of hooking steps
that took place on or before step t; i.e., E^t = { (u, w) | p_s(u) = w for some s ≤ t }. Define
the history graph, G^t, to be G^t = (V, E^t).
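
As a small illustration of the history graph, the sketch below collects the history edge set from a list of snapshots of the p pointers, one dictionary per step. The snapshot representation and the function name history_edges are assumptions made for this example, and the bound is read as "on or before step t".

```python
def history_edges(snapshots, t):
    """Edges (u, w) such that p_s(u) = w for some step s <= t.

    snapshots[s] is assumed to map each vertex u to p_s(u).
    """
    return {(u, w) for s in range(t + 1) for u, w in snapshots[s].items()}


# Vertex 1 hooks to 2 at step 1 and is promoted (p(1) := 1) at step 2.
snapshots = [{1: 1, 2: 2}, {1: 2, 2: 2}, {1: 1, 2: 2}]
edges_at_1 = history_edges(snapshots, 1)
edges_at_2 = history_edges(snapshots, 2)
```

In this example the edge (1, 2) survives in the history edge set at step 2 even though the promotion removed it from the current pointer graph, and the edge set at step 1 is contained in the edge set at step 2.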

The next lemma follows directly from the above definitions by induction, and from the
fact that, for any t, E^t ⊆ E^{t+1}. It is used in Theorem 5.1, the main theorem of this section.

Lemma 5.1 For any vertex, v, and for any step, t, there is a path in G^t from p_t(v) to
R_t(v) to A_t(v), and R_t(v) is the largest vertex on one such path.

Proof. Since E^t grows with t, the only concern is operations that change A and R. By
inspection, the lemma holds for hooking operations. For the doubling operation by vertex
v, we note that for some step s ≤ t, v sets A_t(v) to either A_s(A_{t-1}(v)), R_s(A_{t-1}(v)),
A_s(R_{t-1}(v)), or R_s(R_{t-1}(v)). If R_t(v) is modified, it is set to either R_s(A_{t-1}(v)) or
R_s(R_{t-1}(v)). □

Theorem 5.1 The following two properties hold for any vertex, v, and for any step, t.

1. If R_t(v) = v then v is the largest vertex on a cycle of G_t.

2. If p_t(v) ≠ p_s(v), for some step s < t, then for any vertex, w, reachable from p_s(v) in
G^t without going through v:

(a) There is a path in G_t from w to v.

(b) v > w.

Proof.  The  proof  is  by  induction  on  the  step  number.  Initially,  the  first  part  of  the 
theorem  holds  by  definition  and  the  second  part  holds  vacuously.  Assume  the  theorem 
holds  immediately  before  step  t.  We  show  it  holds  after  step  t. 

Edge  operations  do  not  affect  the  correctness  of  the  theorem.  There  are  three  types 
of  vertex  operations:  promotion,  doubling,  and  hooking.  The  only  steps  that  affect  the 
theorem are those that write to A(v) and R(v). The proof has three cases depending on the
operation  completed  at  step  t. 

Promotion. At step t, v was promoted. Then, R_t(v) = p_t(v) = v and the first part of
the theorem follows. Now, R_{t-1}(v) = v and by Part 1 of the induction hypothesis v is the
largest vertex on a cycle of G_{t-1}. It follows from Part 2 of the induction hypothesis that for
each vertex, x, on a cycle of G^{t-1}, no vertex larger than x can be reached from x without
using the edge to p_{t-1}(x). As v is the largest vertex on a cycle of G_{t-1}, v is the largest
vertex reachable from p_{t-1}(v) in G^{t-1}. The only change made during step t is setting A_t(v)
and p_t(v) to v. Clearly the second part of the theorem follows for v.

Next, we show Part 2 of the theorem for vertices x ≠ v. Part 2b of the theorem follows
for x by applying Part 2 of the induction hypothesis for t - 1 and from the observation that
G^t = G^{t-1} ∪ {<v,v>}; i.e., G^t is obtained from G^{t-1} by adding a self loop (the G_{t-1} edge
removed by the promotion appears in G^t).

To show Part 2a of the theorem for x it suffices to show that if for any step, s, s < t - 1,
p_s(x) ≠ p_{t-1}(x) then v is not reachable in G^{t-1} from p_s(x) without going through x. (For
by Part 2a of the induction hypothesis, if w is reachable from p_s(x) in G^{t-1} without going
through x, then there is a path in G_{t-1} from w to x; this path exists in G_t also unless it
includes v; but the assertion then shows this path includes x, contrary to assumption.) So
suppose, for the sake of contradiction, that v is reachable from p_s(x) without going through
x. It follows from the induction hypothesis (Part 2a) that there is a path in G_{t-1} from v to
x and from Part 2b that x > v. However, this contradicts Part 1, which states that v is the
largest vertex of a cycle of G_{t-1}.

Hooking. At step t, v is hooked to u. So, u ≠ v, A_{t-1}(v) = R_{t-1}(v) = v, and p_t(v) =
R_t(v) = u; the first part of the theorem follows. Part 2 of the theorem follows for v from
Part 2 of the induction hypothesis at t - 1 and the observation that p_{t-1}(v) = v.

Now consider any vertex x ≠ v and let s be any step number, s < t - 1, such that
p_s(x) ≠ p_{t-1}(x). To show that the theorem holds for x it suffices to show that v is not
reachable from p_s(x) without going through x. This follows from Part 2a of the induction
hypothesis at t - 1, for as p_{t-1}(v) = v, there is no path from v to x.

Doubling. At step t, v performs a doubling step. The doubling operation does not affect
the p pointers; therefore, G^t = G^{t-1} and p_t(v) = p_{t-1}(v). Part 2 of the theorem follows
from Part 2 of the induction hypothesis. To complete the proof we have to show that if
R_t(v) = v then v is the largest element on a cycle of G_t.

Assume that R_t(v) = v. Then, by Lemma 5.1, there is a path in G^{t-1} from p_{t-1}(v) to
v, and v is the largest vertex on that path; let P be the shortest such path. We show that
P is a path in G_{t-1} = G_t.

Consider two consecutive vertices, u and w, on P. Since P is a shortest path, there is a
path in G^{t-1} from w to v that does not go through u. As v is larger than u, it follows from
Part 2 of the induction hypothesis that p_{t-1}(u) = w. Since u and w were arbitrary vertices
of P, we conclude that P is a path in G_{t-1} from p_{t-1}(v) to v. Thus, v is the largest element
on a cycle of G_{t-1}, and also of G_t.

This concludes the proof. □

Lemma 5.2 For any vertex, v, R(v) is a p-ancestor of v.

Proof. By Lemma 5.1, for any step, t, there is a path from p_t(v) to R_t(v) in G^t for which
R_t(v) is the largest vertex; let P be the shortest such path. We show that P is a path in
G_t. (An identical argument was used as part of the proof of Theorem 5.1.)

Consider two consecutive vertices, u and w, on P. Since P is a shortest path, there is a
path in G^t from w to R_t(v) that does not go through u. As R_t(v) is larger than u, it follows
from Theorem 5.1 that p_t(u) = w. Since u and w were arbitrary vertices of P, we conclude
that P is a path in G_t from p_t(v) to R_t(v). □

6      Correct  Termination 

The purpose of the edges is to merge components of the underlying graph. As soon as both
endpoints of an edge point to the same vertex the edge is no longer needed. In fact it does
no useful work at any later time. We add the following line to the process for edge e:

if V(e) = V(ē) or ē terminated then Terminate

The following vertex termination code is added to the vertex processes:

Termination condition for vertex v.

if All the edges have terminated then

    if v is a root and next(v) = v then Terminate
    if A(v) is a terminated root then Terminate


Comment.  A  prefix  sum  computation  over  the  edge  processes  can  be  used  to  determine 
when  all  the  edge  processes  have  terminated. 
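
The termination tests can be written as simple predicates. The sketch below is illustrative only: the function names, the dictionary representation of R, A and next, and the use of a plain all(...) in place of the prefix sum over edge processes are assumptions, and the edge test encodes only the stated criterion that both current endpoints of an edge coincide.

```python
def edge_endpoints_coincide(cur_endpoint, cur_other_endpoint):
    """An edge is no longer needed once both current endpoints coincide."""
    return cur_endpoint == cur_other_endpoint


def vertex_should_terminate(v, R, A, nxt, edge_done, terminated):
    """Vertex termination test.

    R, A, nxt map vertices to R(v), A(v), next(v); edge_done is an iterable
    of per-edge termination flags; terminated is the set of vertices that
    have already terminated.
    """
    if not all(edge_done):           # sequential stand-in for the prefix sum
        return False
    if R[v] == v and nxt[v] == v:    # v is a root with next(v) = v
        return True
    a = A[v]
    return R[a] == a and a in terminated   # A(v) is a terminated root
```

A root can therefore stop only after every edge process has stopped, and a nonroot stops only after the root it points to has stopped.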

The following definitions are helpful. Recall that for a root r, next(r) is called a current
destination; u is called a possible destination if there is a root vertex r, for which next(r) = r
and F(V(new(r))) = u; that is, r is not committed and u is the endpoint of the most recent
suggestion given to r. The edge (r, next(r)) is called a current destination edge. For an edge,
e, of the input graph, the current edge corresponding to e is the (virtual) edge (V(e), V(ē)).

The analysis uses an auxiliary graph, Gp+, which is obtained from Gp by replacing each
self loop, (r, r), in Gp, by the edge (r, next(r)) (r must be a root of Gp). The algorithm is
viewed as having two stages; Stage 1 ends when all the edge processes terminate. We show
that at the end of Stage 1 the p+-components are exactly the components of the input graph
G (where the components are considered to form a partitioning of the vertices). We let G_t^+
denote the graph Gp+ at step t.

We start by showing that components of Gp grow monotonically.
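
The construction of the auxiliary graph can be illustrated as follows. The function names (p_plus, components) and the dictionary encoding of the p and next pointers are assumptions made for this sketch; components are computed with a small union-find, treating pointer edges as undirected.

```python
def p_plus(p, nxt):
    """Replace each self loop (r, r) of the p-graph by the edge (r, next(r))."""
    return {v: (nxt[v] if p[v] == v else p[v]) for v in p}


def components(pointer):
    """Vertex partition induced by the (undirected) pointer edges."""
    parent = {v: v for v in pointer}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for v, w in pointer.items():
        parent[find(v)] = find(w)
    groups = {}
    for v in pointer:
        groups.setdefault(find(v), set()).add(v)
    return {frozenset(g) for g in groups.values()}


# Roots 1, 3, 4; root 1 has current destination 3.
p = {1: 1, 2: 1, 3: 3, 4: 4}
nxt = {1: 3, 3: 3, 4: 4}
pp = p_plus(p, nxt)
```

Here the p-components {1, 2} and {3} are joined into a single component of the auxiliary graph by the current destination edge (1, 3), while {4} is unaffected.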

6.1      Components  of  the  Underlying  Graph 

Lemma 6.1 For any two vertices, u and v, and any step t, if u and v are in the same
component of G_t then u and v are in the same component of G_{t'} for any t' > t.

Proof.  Only  two  operations  modify  the  p  pointers:  a  hooking  and  a  promotion.  A  hooking 
operation  replaces  a  self  loop  by  a  new  edge  and  cannot  break  a  component  of  Gp.  By 
Theorem  5.1,  a  vertex  is  promoted  only  if  it  is  the  largest  vertex  in  a  cycle  of  Gp  and  thus 
a promotion cannot break components of Gp. □

Lemma 6.2 For any two vertices, u and v, and for any step number, t, if u and v are in
the same component of G^t, they are in the same component of G_t.

Proof. The edge (u,v) ∈ E^t if and only if p_s(u) = v for some s ≤ t. Then, u and v are
in the same component of G_s and, therefore, they are in the same component of G_t (by
Lemma 6.1). □

Lemma 6.3 For any vertex, v, v, A(v), and F(v) are in the same component of Gp.

Proof. The lemma follows from Lemma 5.1 and Lemma 6.2. □

Lemma 6.4 For any pair of vertices u and v, and for any step number, t1, if u and v are
in the same component of G_{t1}^+ then they are in the same component of G_{t2}^+ for any t2 > t1.

Proof.  The  proof  is  by  induction  on  the  step  number.  Initially  the  lemma  holds.  Assume 
the lemma holds at step t' for all t' < t. We show it holds at step t.

The p+-edges are of two types: p-edges and current destination edges. p-components are
never broken (by Lemma 6.1). A current destination edge, (x,y), with x ≠ y, is removed in
one of three cases: when a new link is created from x to y, when y = x, or when (x,y) is
replaced by (x, F_s(y)), for some s < t.

When a link is created, linking x to y, the p+-edge (x,y) is replaced by a p-edge (x,y);
this operation does not modify the p+-components. If y = x, clearly removing (x,y) from
Gp+ does not alter the p+-components. y and F_s(y) are in the same component of G_s (by
Lemma 6.3) and, therefore, in the same component of G_t (Lemma 6.1). Thus, replacing
(x,y) by (x, F_s(y)) does not alter the p+-components.

□

6.2      Edge  Termination 

Next, we show the correctness of the edge processes by showing that once all the edge
processes terminate the p+-components are exactly the components of the input graph G
(Theorem 6.1). This proceeds as follows: Lemma 6.5 shows that despite the edge migration,
at any time, the current endpoint, V(e), of an edge, e, is in the same p+-component as its
original endpoint. This, together with Lemma 6.6, which shows that p+-components are
contained in components of G, demonstrates that the graph resulting from adding all the
current (non-terminated) edges to G_t^+ has the same components as the input graph. (Recall
that the current edge refers to the (virtual) edge connecting the current endpoints of an
input edge.)

For any vertex v, let C(v) (resp. C^+(v)) denote the component of the input graph (resp.
of G_t^+) containing v. Lemma 6.7 shows that for any vertex, v, as long as C^+(v) ≠ C(v),
there is a non-terminated edge, e, whose endpoint is in C^+(v).

Lemma 6.5 For each edge, e, and for any two steps, t < t', V_t(e) and V_{t'}(e) are in the
same p+-component at steps ≥ t'.

Proof. Let t ≤ t'' < t'. The proof is by induction on the step number. So we suppose
inductively that V_t(e) and V_{t''}(e) are in the same p+-component at steps ≥ t''. Suppose
that at step t'' + 1, an endpoint, v = V_{t''}(e), migrates either to F_s(v) or to F_{s'}(next_s(v)), for
some s ≤ s' ≤ t''. F_s(v) and v are in the same component of G_s, and thus, by Lemma 6.1,
of G_{t'''}, for t''' ≥ t''. Likewise, w = next_s(v) and F_{s'}(w) are in the same component of G_{t'''},
for t''' ≥ t''. Finally, w and v are in the same component of G_s^+ by definition, and hence of
G_{t'''}^+, for t''' ≥ t'', by Lemma 6.4. Clearly the inductive hypothesis holds for t' = t'' + 1; so
the lemma follows by induction. □

Lemma 6.6 For any pair of vertices, u and v, if u and v are in the same p+-component,
then they are in the same component of the input graph, G.

Proof. The proof is by induction on the step number. Initially each vertex is a
p+-component and the lemma holds vacuously. Assume the lemma holds immediately before
step t, for some t. The only new intercomponent edges added to Gp+ at step t are current
destination edges. A new current destination edge, (x,y), is added to Gp+ only if x is a
root and y is a possible destination of x. In order for y to become a possible destination
of x there must be an input edge, e = (x',z'), and some step numbers, t1 < t2 < t3 < t,
for which V_{t1}(e) = x and V_{t2}(ē) = z, where F_{t3}(z) = y. By Lemma 6.5, x and x' are in the
same p+-component at step t1, and, thus, by the induction hypothesis, x and x' are in the
same component of the input graph, G. Similarly, z and z' are in the same component of
the input graph G.

As there is an edge in G from x' to z', x' and z' are in the same component of G.
Finally, by Lemma 6.3, F_{t3}(z) = y and z are in the same component of G_{t3}. Thus, by the
inductive hypothesis, y and z are in the same component of the input graph. Therefore, all
the vertices in the new p+-component created by adding the current destination edge (x,y)
are in the same component of G, as required. □

Lemma 6.7 Let v be any vertex, and let t be any step number. If C^+(v) ≠ C(v) then there
is an edge, e, with V(e) ∈ C^+(v) and V(ē) ∉ C^+(v).

Proof. Assume that C^+(v) ≠ C(v). By Lemma 6.6, C^+(v) ⊆ C(v); therefore, there must
be a vertex, w, in C(v) - C^+(v).

Since v and w are in the same component of G, there is a path in G from v to w. As
v ∈ C^+(v) and w ∉ C^+(v), there must be an edge, e = (x,y), in the input graph, G, on a
G-path from v to w for which x ∈ C^+(v) and y ∉ C^+(v). By Lemma 6.5, at step t, V((x,y))
is in C^+(v) and V((y,x)) ∈ C^+(y) ≠ C^+(v). □



Theorem 6.1 (Stage 1) If there are no active edge processes then for any pair of vertices,
u and v, u and v are in the same p+-component if and only if they are in the same component
of the input graph, G.

Proof. Let u and v be two vertices. If C(u) = C(v) and C^+(u) ≠ C^+(v) then by Lemma 6.7
there is an edge, e, with one endpoint in C^+(v) and one outside. By Lemmas 6.4 and 6.5, the
endpoints of e were never in the same p+-component and thus e could not have terminated.
Conversely, if C^+(u) = C^+(v), u and v are in the same component of the input graph by
Lemma 6.6. □

6.3     Vertex  Termination 

The main theorem of this section is Theorem 6.2 which shows that if the algorithm
terminates, it terminates correctly.

Lemma 6.8 For any vertex v, if the process associated with v terminated then A(v) is the
root of the p+-component containing v.

Proof. Assume that all the edge processes have terminated. There are two ways in which a
vertex v can terminate. First, if v is a root of Gp and next(v) = v; then, v is also a root of
Gp+. Second, if A(v) is a terminated root (and hence A(v) ≠ v); then, v is not a root of Gp.
So a root of Gp terminates only if it is also a root of Gp+. After a process associated with
a vertex, v, starts its termination procedure, R(v) and A(v) are not subsequently modified.
Therefore, every terminated root remains a root in Gp+; likewise, every other terminated
node remains a non-root in Gp and hence in Gp+.

For a root, r, in Gp, A(r) = r. Thus, on termination, a root, r, in Gp+, has A(r) = r. As
A(r) is not modified subsequently, the lemma holds for terminated roots. For a non-root,
v, with A(v) a terminated root, A(v) must be a root of Gp+. v and A(v) are in the same
p-component (by Lemma 6.3) and, therefore, in the same p+-component. Thus, A(v) is
the (unique) root of the p+-component containing v. So the lemma holds for terminated
non-roots as well. □

Theorem 6.2 (Correct Termination) Let u and v be any two vertices whose associated
processes have terminated. Then u and v are in the same component of the input graph if
and only if A(u) = A(v).

Proof. Let u and v be any two vertices whose processes have terminated. By Lemma 6.8,
A(v) is the root of the Gp+ component containing v. If u and v are in the same component
of the input graph then, by Theorem 6.1, u and v are in the same p+-component and thus
A(u) = A(v). On the other hand, if A(u) = A(v) then, by Lemma 6.8, u and v are in the
same p+-component. By Theorem 6.1, u and v are in the same component of the input
graph. □

7     The  q  Graph 

The analysis of the rounds complexity uses a potential function. In order to simplify the
analysis, we define the potential of the graph in a way that guarantees the potential cannot
increase. Consider two vertices, u and v, on an A-path (a path in the pointer graph induced
by the A pointers). We may view the A-distance between u and v to be the number of edges
on this path. Unfortunately, we cannot guarantee that for every vertex, v, A(v) and F(v)
are p-ancestors of v. Without this property, it was hard to define a good potential function.

In this section we define a second underlying graph, called the q-graph, in which F(v) is
always a q-ancestor of v (A(v) need not be a q-ancestor of v but we show that this has no
adverse effects). We then define the potential of the graph as a function of the q-distances.
Replacing the p-graph by the q-graph simplifies the analysis considerably.

We  start  by  introducing  some  notation. 

Shadow Variables. The analysis examines high level operations such as doubling,
promotion and hooking, each comprising a number of events. However, the operations are not
executed atomically; an operation executed by a process might use information it read at
some previous step; at the time this information is used it might be out of date.

The operations performed by the algorithm modify pointers and counters; the new value
assigned to a pointer is a value read during the operation. Suppose that a process, P,
performs an operation on vertex r which writes to some pointer, for example, A(r), a new
value, v. Further, suppose that the actual change to A(r) occurred at some step, t. Then, at
some step, s, s < t, of the operation, the process read the value v. This value is imagined to
be assigned to a shadow of variable A(r), denoted Â(r), at step s. Between step s and step t
other processes may have executed events, and possibly modified other variables, including
the variable from which Â(r) was read. Initially, for any vertex, u, and any pointer, A,
Â(u) = A(u). It follows by induction that immediately before a process starts executing an
operation on vertex r, Â(r) = A(r).

For example, suppose that a hooking operation performed at a root, r, begins at step s1.
It reads the record associated with next(r) at step s2 (next(r) is a pointer associated with
vertex r and therefore does not have to be read at the current step). r receives the values
associated with next(r), namely Ar = A(next(r)) and Rr = R(next(r)). Then, at step s3,
it reads the record associated with Rr to determine the value of F(next(r)). Suppose that
at step s4 it writes Ar into A(r). Then at step s2, Â(r) = Ar and at step s4, A(r) = Ar.

Recall the function F(v). We define the shadow of F(v), F̂(v), as follows. If ctr(R̂(v)) >
Rctr(v) then F̂(v) = R̂(v); otherwise, F̂(v) = Â(v).

The shadow variable of next(v) is a special case. Suppose v is a root and it is executing
a step which assigns next(v) a new value, w. Let t' be the step in which v reads w. If the
operation does not require a suggestion from an edge process, the shadow of next(v) becomes
equal to w at step t'. Otherwise, the shadow of next(v) is updated in two steps. Suppose that
v reads a suggestion from new(v) at step t1; this suggestion is a pointer to an edge, e. It then
reads the value of u = V(e) at some step t2, t2 > t1. Define the shadow of next(v) to be equal
to u at step t2. It then becomes equal to w = F(u) at step t'.

Followers and Followsets. We define a relation, supporter, among the vertices. Initially,
none of the vertices are supporters. If a p-cycle with leader l is created at step t, then all
the vertices on the cycle, except l, become supporters of l. In the next lemmas we prove
some properties of the supporter relation, including that it is a partial order; we then define
the follower relation to be the transitive closure of the supporter relation.

Lemma  7.1  For  any  vertex,  v,  and  for  any  step,  t,  if  v  has  a  larger  p-ancestor  at  step  t, 
it  has  a  larger  p-ancestor  at  step  t' ,  for  any  t'  >  t. 

Proof. The proof is by induction on the step number. Let v be a vertex and assume that v
has a larger p-ancestor, y, at step s. We show it has a larger p-ancestor at step s + 1. If, at
step s + 1, y ceases to be a p-ancestor of v, then at that step a vertex, x, on the p-path from
v to y, is promoted. In that case, x is a p-ancestor of v at step s + 1 and, by Theorem 5.1
(p. 12), x > y > v. □

Lemma  7.2  If  v  is  a  supporter  at  step  t,  it  is  not  a  leader  at  step  t' ,  for  any  t'  >  t. 

Proof.  A  vertex  that  has  a  larger  p-ancestor  is  not  a  leader.  A  vertex,  u,  that  becomes  a 
supporter  at  step  s  has  a  larger  p-ancestor  at  that  step  and,  therefore,  by  Lemma  7.1,  is  not 
a leader at any step, t' > s. □

Lemma  7.3   If  v  is  a  supporter  of  u  then  v  <  u  and  there  is  a  p-path  from  v  to  u. 




Proof. Let t be any step and assume that v is a supporter of u at step t. Then at some step,
s, s ≤ t, v became a supporter of u. By definition, u and v are on a p_s-cycle and v < u;
thus, there is a p_s-path from v to u. Furthermore, all the vertices on the p_s-path from v to
u are supporters of u. By Lemma 7.2, a supporter cannot become a leader; therefore none
of these vertices change their p-pointers at any step, t', t' > s. It follows that there is a
p_t-path from v to u and v < u. □

Lemma 7.4 The transitive closure of the supporter relation is irreflexive.

Proof.  It  follows  by  induction  from  Lemma  7.3  that  if  u  is  a  transitive  supporter  of  v  then 
u < v. □

Definition  7.1  Define  the  relation  follower  to  be  the  transitive  closure  of  the  supporter 
relation:  For  any  two  vertices,  u  and  v,  u  is  a  follower  of  v  if  u  is  either  a  supporter  of  v 
or  a  supporter  of  a  follower  of  v. 

Definition  7.2  Partition  the  vertices  into  sets,  called  followsets,  as  follows:  For  any  two 
vertices,  u  and  v,  u  and  v  are  in  the  same  followset  if  either  u  is  a  follower  of  v  or  if  both 
u and v are followers of some vertex, w.

For  any  vertex,  v,  let  followset{v)  denote  the  followset  containing  v.  If  the  followset 
contains  exactly  one  non-follower  vertex,  that  vertex  is  called  the  head  of  the  followset  and 
is  denoted  by  head(v). 
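
Definitions 7.1 and 7.2 can be made concrete with a small sketch. The representation of the supporter relation as a set of pairs (u, v), meaning "u is a supporter of v", and the function names below are assumptions made for illustration only.

```python
def follower_closure(supporter_pairs):
    """follows[u] = set of v such that u is a follower of v
    (transitive closure of the supporter relation)."""
    follows = {}
    for u, v in supporter_pairs:
        follows.setdefault(u, set()).add(v)
        follows.setdefault(v, set())
    changed = True
    while changed:
        changed = False
        for u in follows:
            # Everything followed by something u follows is also followed by u.
            extra = set().union(*(follows[v] for v in follows[u])) - follows[u]
            if extra:
                follows[u] |= extra
                changed = True
    return follows


def same_followset(follows, u, v):
    """u and v share a followset if one follows the other or both follow some w."""
    return (u == v or v in follows[u] or u in follows[v]
            or bool(follows[u] & follows[v]))


def head(follows, v):
    """The unique non-follower in followset(v), or None if it is not unique."""
    candidates = [w for w in follows[v] | {v} if not follows[w]]
    return candidates[0] if len(candidates) == 1 else None


# 1 and 2 support 3, and 3 supports 5.
follows = follower_closure({(1, 3), (2, 3), (3, 5)})
```

In this example vertices 1, 2, 3 and 5 form one followset whose head is 5, which is also the largest vertex in the set.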

Lemma  7.5  For  any  step,  t,  and  for  any  two  vertices,  u  and  v,  if  u  and  v  are  in  the  same 
followset  at  step  t  they  are  in  the  same  followset  at  step  t' ,  for  any  t' ,  t'  >  t. 

Lemma 7.6 For any vertex, u, if u is a supporter of some vertex, v, at step t, and u
becomes a supporter of some vertex, w, w ≠ v, at step t + 1, then v becomes a supporter of
w at step t + 1.

Proof. As u becomes a supporter of w at step t + 1, u is on a p_{t+1}-cycle with leader w.
By Lemma 7.3, there is a p_{t+1}-path from u to v; therefore, v is on the same p_{t+1}-cycle. It
follows that v becomes a supporter of w at step t + 1. □

Theorem 7.1 Let v be any vertex. Then

1. followset(v) has exactly one head, head(v).

2. head(v) is the largest vertex in the followset.

3. There is a p-path from v to head(v).

Proof. Lemma 7.4 implies that each followset has at least one non-follower vertex and
Lemma 7.6 shows it has at most one; Part 1 of the theorem follows. This implies that if
v ≠ head(v), v is a follower of head(v). Therefore, there is a sequence of vertices, v_1, v_2,
..., v_l, such that v_1 = v, v_l = head(v) and for any i, 1 ≤ i < l, v_i is a supporter of v_{i+1}.
The last two parts of the theorem follow from Lemma 7.3 by induction on the length of the
sequence. □

7.1  Finite  Counters 

In this section we digress a bit and show that a node can be promoted at most n - 1 times.
From  this  it  follows  that  counters  of  logarithmic  size  suffice. 

Lemma  7.7  A  vertex,  v,  cannot  hook  to  one  of  its  followers. 

Proof.  The  proof  is  by  induction.  Initially  there  are  no  followers.  We  show  that  if  a  vertex 
u  becomes  a  follower  of  v  at  some  step  s  then  v  cannot  hook  to  u  at  any  step  s'  >  s. 

Let u and v be any two vertices. If u becomes a follower of v at step s (for some s),
then, at step s, v is a non-root leader. Furthermore, by Lemma 7.2 applied to u, u is not a
leader at any step s' > s. If v links to some vertex, w, at some step, t, t > s, then, at some
step, t', s < t' < t, w is a root, and hence a leader. It follows that v cannot hook to u at
any step s' > s. □

Lemma 7.8 A vertex, v, is promoted at most n - 1 times.

Proof. By Lemma 7.7, each time a p-cycle is created, at least one non-follower vertex
becomes a follower. □

7.2  The  q  pointers 

We define a linear ordering among the vertices of a followset by defining a mapping, q,
recursively, as follows. Each followset has two designated vertices called the head (which
was defined above) and the tail (which is defined here). Let v be any vertex. Initially, v
is in a singleton followset, {v}, and tail_0(v) = head_0(v) = v. Whenever a p-cycle with
leader l is created at step t, followset_t(l) includes all the vertices on the cycle and all their
followers. For each vertex, u, on the cycle, except l, if u is a follower before step t, define
q_t(u) = q_{t-1}(u); otherwise, define q_t(u) = tail_{t-1}(p_t(u)). Also, tail_t(l) = tail_{t-1}(p_t(l)). This
defines the q mapping for followers.

Lemma 7.9 For any vertex, v, and any step, s, if v becomes a follower at step s then, for
any t, t > s

1. q_t(v) = q_s(v), and

2. v and q_t(v) are in followset_t(v).

Proof. Follows from the definition. □

Lemma 7.10 For any vertex, v, there is a simple q-path from tail(v) to head(v). The path
goes through all the vertices of followset(v).

Proof. The proof is by induction on the step number. Assume the lemma holds for some
step, t, and consider step t + 1. If a new p-cycle is created at step t + 1 then the only
non-follower vertices that become followers at step t + 1 are vertices on the new p-cycle. The
lemma for step t + 1 follows by induction on the number of such non-followers.

If a new p-cycle is not created at step t + 1, the lemma for t + 1 follows from the induction
hypothesis and Lemma 7.9. □

Definition 7.3 We extend the definition of the q mapping to non-followers by defining
q_t(v) = tail_t(p_t(v)), for every non-follower, v.

Lemma  7.11    For  any  vertex,  v,  if  q(v)  is  not  in  followset(v),  then  q(v)  is  a  tail. 

Proof. The lemma follows from Lemma 7.10. □

Define the q-graph, Gq, to be the (pointer) graph induced by the q-mapping. We now
examine the relation between the p-graph and the q-graph. The main result of this section
is stated in Theorem 7.2 which shows that F(v) is a q-ancestor of v. Let G_t^q denote the
q-graph at step t. Define the mapping q+ as follows: For any vertex, r, if r is a root then
q_t^+(r) = tail_t(next_t(r)); q_t^+(r) = q_t(r) otherwise. (Recall that a vertex is a root if and only
if R(v) = v.) Define the q+-graph, Gq+, to be the graph induced by the mapping q+.

Lemma  7.12  For  any  two  vertices,  u  and  v,  if  there  is  a  q-path  from  u  to  v  then  all  the 
non-follower  vertices  on  this  path  are  p-ancestors  of  u. 

Proof. The proof is by induction on the length of the path. head(u) is the first non-follower
vertex on any q-path starting at u (head(u) = u if u is not a follower). By Theorem 7.1
(p. 21), there is a p-path from u to head(u).

Let w be any non-follower vertex on the q-path from u to v. Then q(w) is tail(p(w)) and
all the vertices on the q-path from tail(p(w)) to p(w), except possibly p(w), are followers,
by definition. If p(w) is a follower, then the first non-follower on the q-path starting at p(w)
is head(p(w)) and there is a p-path from p(w) to head(p(w)) (by Theorem 7.1 (p. 21)). So,
by induction, the non-followers on the q-path starting at u all lie on a p-path in the same
order as on the q-path. □

Lemma 7.13 For any vertex, v, p(v) is a q-ancestor of v.

Proof. Let v be any vertex, and consider some step, t. If v is not a follower at step t, then
q_t(v) = tail_t(p_t(v)) (by definition). By Lemma 7.10, there is a q_t-path from tail_t(p_t(v)) that
goes through all the vertices of the followset containing p_t(v); in particular, this path goes
through p_t(v).

So suppose s is the step in which v becomes a follower. By the definition of the q
pointer, q_s(v) = tail_{s-1}(p_s(v)). Let u = p_s(v). There is a q_{s-1}-path from tail_{s-1}(u) to u
(Lemma 7.10). All the vertices on this path, except possibly u, are followers and, therefore,
do not change their q-pointers during step s or subsequently (Lemma 7.9). Therefore, for
t > s, there is a q_t-path from tail_{s-1}(u) = q_s(v) = q_t(v) to u = p_s(v) = p_t(v). □

Corollary 7.1 For any vertex, v, p^+(v) is a q^+-ancestor of v.

Proof. Note that if q^+(v) ≠ q(v) then q^+(v) = tail(p^+(v)). The corollary follows from
Lemma 7.10. □

Lemma 7.14 Each p^+-component is a q^+-component.

Proof. By Corollary 7.1, each p^+-component is contained in a q^+-component. It remains to
show they are equal. It suffices to show that v and w = q^+(v) are in the same p^+-component.
If v and w are in the same followset, head(v) = head(w), which by Theorem 7.1, part 3,
is a p-ancestor of v and w; so v and w are in the same p-component and hence the same
p^+-component. If v and w are in distinct followsets, then w = tail(p^+(v)); the argument of
the previous sentence shows that p^+(v) and w are in the same p^+-component, and hence so
are v and w. □

Lemma  7.15   For  any  vertex,  v,  if  v  has  a  larger  q-ancestor,  it  has  a  larger  p-ancestor. 

Proof. By Theorem 7.1, the head of a followset is the largest vertex in the followset; also it
is not a follower. Let w be the largest q-ancestor of v and let x = head(w). By Lemma 7.12,
x is a p-ancestor of v; further it is at least as large as w and hence is larger than v. □
Lemma 7.16 All the vertices of a q-cycle comprise a single followset.

Proof. Consider a q-cycle, C. If C has two non-follower vertices, u and v, then by
Lemma 7.12, u and v are on a p-cycle. But each p-cycle can have at most one non-follower
vertex. It follows that each q-cycle has at most one non-follower vertex, l. Let w be a vertex
of C. By Lemma 7.10, C includes head(w), which is a non-follower vertex. As x = head(w)
is the only non-follower vertex, q(x) must be in followset(w) (for otherwise head(q(x))
would be a different non-follower vertex on C). So q(head(w)) = tail(w). The lemma now
follows from Lemma 7.10. □

Lemma 7.17 For any two vertices, u and v, if there is a path, P_1, from u to v in G^q_{t-1} and
none of the vertices on this path are promoted at step t then there is a path, P_2, from u to
v in G^q_t; also

1. P_1 ⊆ P_2,

2. all the vertices in P_2 − P_1 are followers.

Proof. As none of the vertices on P_1 is promoted, the only way a vertex, x, on P_1 can change
its q pointer is if x is not a follower and at step t, p_{t-1}(x) becomes part of a p-cycle which
does not go through x. In this case, p_t(x) = p_{t-1}(x) and q_t(x) = tail_t(p_t(x)). The q_t-path
going from tail_t(p_t(x)) to the leader of the new cycle goes through p_t(x) and includes all the
vertices on the q_{t-1}-path from q_{t-1}(x) = tail_{t-1}(p_{t-1}(x)) to p_{t-1}(x) = p_t(x). In addition,
all the vertices added are followers by definition. □

Definition  7.4  For  any  pair  of  vertices,  u  and  v,  we  say  that  u  dominates  v  if  there  is  a 
q-path  from  v  to  u  and  u  is  larger  than  all  the  non-follower  vertices  on  this  path.  Similarly, 
we  say  that  v  forward  dominates  u  if  there  is  a  q-path  from  v  to  u  and  v  is  larger  than  all 
the  non-follower  vertices  on  this  path. 
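
Definition 7.4 can be made concrete with a short sketch. The dictionary encoding of q, the helper q_path, and the function names are assumptions for illustration only. The sketch also assumes that the comparison excludes the dominating vertex itself, since a vertex cannot be larger than itself.

```python
def q_path(start, end, q):
    """Vertices on the q-path from start to end, or None if end is never reached."""
    path, seen, u = [start], {start}, start
    while u != end:
        u = q[u]
        if u in seen:          # looped back without meeting `end`
            return None
        seen.add(u)
        path.append(u)
    return path


def dominates(u, v, q, follower):
    """u dominates v: a q-path runs from v to u, and u is larger than every
    other non-follower vertex on it."""
    path = q_path(v, u, q)
    if path is None:
        return False
    return all(u > x for x in path if x != u and not follower[x])


def forward_dominates(v, u, q, follower):
    """v forward dominates u: a q-path runs from v to u, and v is larger than
    every other non-follower vertex on it."""
    path = q_path(v, u, q)
    if path is None:
        return False
    return all(v > x for x in path if x != v and not follower[x])


# Usage on a hypothetical pointer map (vertex 2 is a follower).
q = {1: 2, 2: 5, 5: 7, 7: 7, 9: 3, 3: 4, 4: 4}
follower = {v: False for v in q}
follower[2] = True
print(dominates(7, 1, q, follower), forward_dominates(9, 4, q, follower))
```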

Theorem 7.2 There are simple q-paths from q(v) to R(v), from R(v) to F(v), from q(v)
to \bar{R}(v) and from \bar{R}(v) to \bar{F}(v); also:

1. R(v) dominates q(v) and forward dominates F(v).

2. \bar{R}(v) dominates q(v) and forward dominates \bar{F}(v).

Proof.  The  proof  is  by  induction  on  the  step  number.  The  theorem  is  maintained  by  any 
step  of  a  doubling  operation  and  any  step  of  a  hooking  operation,  by  Lemma  7.17.  We  must 
consider  the  effect  of  a  promotion. 
A vertex can be promoted only if it is the largest vertex on a p-cycle. Also, followers
cannot be promoted (Lemma 7.2). It follows that R(v) (resp. \bar{R}(v)) is the only vertex on the
q-path from q(v) to R(v) (resp. from q(v) to \bar{R}(v)) that can be promoted (for by Lemma 7.15
any other vertex on this path has a larger p-ancestor). If R(v) (resp. \bar{R}(v)) is promoted at
step t, F_t(v) = R(v) (resp. \bar{F}_t(v) = \bar{R}(v)) and the theorem follows.

If a vertex, x ≠ R(v), on the q-path from R(v) to F(v) is promoted, then head(R(v))
is not on the p-cycle promoting x (for head(R(v)) ≥ R(v) > x). In this case, after the
promotion, there is a q-path from R(v) to F(v) that does not go through x. All the vertices
on the new q-path, which were not on the old q-path, are followers, by definition. Part 1 of
the theorem follows. An identical argument replacing R(v) with \bar{R}(v) and F(v) with \bar{F}(v)
shows that Part 2 of the theorem holds as well. □

8     Analysis 

The  analysis  proceeds  in  two  stages,  corresponding  to  the  two  stages  of  the  algorithm.  The 
input  graph,  G,  may  have  several  components.  The  goal  of  the  algorithm  is  to  select  a 
representative  for  each  component,  and  set  all  the  other  vertices  of  the  component  to  point 
to  the  representative. 

During Stage 1, the algorithm links components of the underlying graph until each such
component  contains  all  the  vertices  of  exactly  one  component  of  the  input  graph.  During 
this  stage,  we  divide  the  vertices  into  active  and  inactive  vertices.  Roughly  speaking,  the 
inactive  vertices  do  not  affect  the  performance  of  Stage  1. 

The  analysis  of  Stage  1  uses  a  potential  function  argument.  Recall  that  the  promotion 
operation  disconnects  the  underlying  p-graph  and  a  hooking  operation  disconnects  the  q- 
graph.  We  partition  the  graph  into  chains  and  define  a  depth  function  on  the  vertices 
relative  to  the  partition.  The  partition  we  chose  guarantees  that  chains  are  never  broken; 
were  a  chain  to  break,  the  sum  of  the  potential  of  its  two  parts  might  be  larger  than  the 
potential  of  the  original  chain.  The  potential  of  the  graph  is  a  function  of  the  sum  of  the 
depths of all the chains. The chains fall into two classes, long chains and short chains. We
show  that  in  each  round  the  long  chains  lose  a  constant  fraction  of  their  depth.  The  short 
chains  require  a  more  delicate  accounting  scheme.  We  show  that  a  constant  fraction  of  the 
short  chains  lose  a  constant  fraction  of  their  weight. 

During  Stage  2  of  the  algorithm  each  component  of  the  underlying  graph  corresponds 
to  a  component  of  the  input  graph.  In  addition,  the  only  vertices  that  can  be  promoted 
during  Stage  2  are  the  final  leaders  of  the  graph.   Consequently,  the  analysis  of  Stage  2  is 
much simpler.

In  Section  9  we  divide  the  vertices  into  active  and  inactive  vertices.  We  then  define  a 
chain  partition  of  the  9-graph.  In  Section  10  we  use  this  partition  and  define  the  depth 
function.  Finally,  in  Section  11  we  define  the  potential  function  and  analyze  the  first  stage 
of  the  algorithm.  In  Section  12  we  analyze  the  second  stage  of  the  algorithm. 

9     Active  Vertices 

Recall  that  before  a  root  vertex,  r,  can  create  a  new  link  it  must  find  the  root  of  the 
component  to  which  it  is  attempting  to  hook.  If  r  is  the  root  of  a  star,  it  may  not  make  any 
"real"  progress  while  waiting.  The  component,  C,  r  is  trying  to  hook  to  may  be  shrinking 
quite  fast;  however,  C  may  have  many  stars  waiting  to  hook  onto  it.  Therefore,  the  progress 
C makes may not be enough to offset the lack of progress made by all the waiting roots.

We  overcome  this  difficulty  as  follows:  We  divide  the  vertices  of  the  graph  into  two 
disjoint sets: the active vertices and the inactive vertices. The analysis uses a potential
argument, assigning each component of G^q a potential; the potential of the graph is the
sum,  over  all  the  components,  of  the  potential  of  the  component.  A  portion  of  the  potential 
is  due  to  weights  assigned  to  the  vertices;  the  inactive  vertices  have  no  weight  and  thus  do 
not  contribute  to  the  potential. 

Recall  that  the  edge  processes  migrate  out  of  roots  along  the  current  destination  edges. 
This  guarantees  that  if  r  is  a  root  of  a  star  waiting  to  hook  to  some  component,  C,  after 
a  small  constant  number  of  steps  either  r  hooks  to  C  or  some  other  root  hooks  to  r,  in 
which case progress is made, or r and its component become inactive, decreasing the potential.
We then show that at each round, the potential of the graph decreases by a constant
multiplicative  factor.  This  reduction  is  aided  by  the  fact  that  some  of  the  vertices  become 
inactive  during  the  round. 

We view the q-graph as divided into spans; for each vertex, v, the q-path from tail(v)
to head(v) is the span containing v (there is such a path by Lemma 7.10). v is called an
endpoint if for some edge, e, of the input graph either v = V(e) or v = \bar{V}(e). If v is executing
a hooking operation and A(v) = w ≠ v we say that v is in the process of hooking to w;
w is called the link destination of v. A span of G^{q+} is called an endpoint if it contains an
endpoint of G^q; it is called a link destination if it contains a link destination.

The vertices are divided into active and inactive vertices as follows. For any edge, e, the
endpoint V(e) and its shadow \bar{V}(e) are active. In addition, for each vertex, v, if a vertex
in the span containing v is active then v is active. Also, if v is active then next(v), q(v),
\bar{next}(v), and \bar{q}(v) are active.
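
The rules above define the active set as a closure, which can be computed by a standard worklist search. The following is a minimal sketch under assumed inputs: seeds stands for the edge endpoints and their shadows, span_of[v] lists the vertices of the span containing v, and successors(v) returns next(v), q(v) and their shadow values. None of these names come from the paper.

```python
from collections import deque


def active_vertices(seeds, span_of, successors):
    """Smallest set containing `seeds` that is closed under the three rules:
    spans are active as a whole, and the next/q values (and their shadows)
    of an active vertex are active."""
    active, work = set(), deque(seeds)
    while work:
        v = work.popleft()
        if v in active:
            continue
        active.add(v)
        work.extend(span_of[v])       # an active vertex activates its span
        work.extend(successors(v))    # and the vertices it points to
    return active


# Usage on a hypothetical five-vertex instance: vertices 1 and 2 share a span.
span_of = {1: [1, 2], 2: [1, 2], 3: [3], 4: [4], 5: [5]}
points_to = {1: [3], 2: [], 3: [3], 4: [5], 5: [5]}
print(sorted(active_vertices([2], span_of, points_to.__getitem__)))
```

Vertices never reached by the search (4 and 5 in this instance) are the inactive ones.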

A vertex, v, is called a potential leader if it is either a leader or, for some link destination,
u, v is the largest vertex on the p-path from u to the leader of the component; if v is not
a leader, the link destination, u, is said to be in the promoting set of v. Intuitively, if r is
the root of the p-component containing v and u is in the promoting set of v, then hooking r
to u would make v a leader. Vertices which are not potential leaders are called simple. We
show that a simple vertex can never become a potential leader (Lemma 9.3).
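
The definition of potential leaders and promoting sets can be sketched as follows. This is only an illustration: the set leaders, the list link_destinations and the dictionary p are hypothetical inputs, and the sketch assumes that every p-path starting at a link destination reaches a leader.

```python
def potential_leaders(leaders, link_destinations, p):
    """Return (potential leaders, promoting sets).

    For each link destination u, walk the p-path from u up to a leader and
    record the largest vertex met on the way.
    """
    result = set(leaders)
    promoting = {}
    for u in link_destinations:
        largest, x = u, u
        while x not in leaders:       # assumption: the walk reaches a leader
            x = p[x]
            largest = max(largest, x)
        result.add(largest)
        if largest not in leaders:    # promoting sets are kept for non-leaders only
            promoting.setdefault(largest, set()).add(u)
    return result, promoting


# Usage on a hypothetical component whose leader is vertex 10.
p = {3: 8, 8: 5, 5: 10, 12: 4, 4: 10, 10: 10}
found, promoting = potential_leaders({10}, [3, 12], p)
print(sorted(found), promoting)
```

Every vertex outside the returned set is simple in the sense of the definition above.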

A span that contains a potential leader is called a potential leader span or simply a
potential leader. Lemma 9.1 shows that the head of the span is the only vertex of the span
that can be a potential leader. For any span, s, we call the span containing q(head(s)) the
parent of s and denote it by q(s); s is called a child of q(s). We say that a span is active if
its vertices are active. A span which is neither a potential leader nor a link destination and
has at most one active child is called chained. By Lemma 7.11, a span which is neither a
potential leader nor a link destination is chained if and only if its tail has at most one active
q-child.

Define a chain partition of G^q to be a partition of the q-graph into disjoint q-paths, called
chains. Each chain, C, satisfies:

1. For any potential leader, l, if C ∩ span(l) ≠ ∅ then C ⊆ span(l).

2. If v is active but not a potential leader, and if q(v) is in the chain containing v, then v
is the only active q-child of q(v). In addition, if v and q(v) are in distinct spans then
q(v)'s span is not a link destination.

A chain partition, U, is a natural chain partition if, in addition:

1. Each span of G^q is contained in a chain of U. (This implies that a chain which contains
a potential leader comprises exactly one span of G^q.)

2. All the active spans contained in a chain form a sequence, s_1, s_2, ..., s_l, satisfying:

a) s_{i+1} = q(s_i), for 1 ≤ i < l,

b) for any i, 2 ≤ i ≤ l, s_i is chained, and

c) q(s_l) is not chained.

A natural chain partition is unique on the active vertices of the graph; chain partitions need
not be unique, but on the active vertices they are refinements of a natural chain partition.
Consider any chain partition, U. For each chain, τ, of U, q(τ) and next(τ) are defined in
the obvious way. Likewise, we say a chain is a link destination, a current destination, a root,
a leader, a potential leader, or an endpoint if it contains a vertex of the corresponding type.
The remainder of this section justifies the above definitions by showing that inactive
vertices can never become active (Lemma 9.2) and that simple vertices can never become
potential leaders (Lemma 9.3). Finally, we show some properties of natural chain partitions
(Theorem 9.1). These are used in the next section to guarantee that the potential function
does not increase.

Lemma 9.1 The head of a followset is the only vertex in the followset that can be a potential
leader.

Proof. A vertex that has a larger p-ancestor cannot be a potential leader by definition.
The lemma follows from Theorem 7.1 (p. 21). □

Lemma  9.2  If  a  vertex  is  inactive  at  step  t,  it  is  inactive  at  step  t',  for  any  t'  >  t. 

Proof. The proof is by induction on the step number. Assume the lemma holds for every
step up to and including step s, for some step s, s ≥ t. We show it holds for step s + 1.
It suffices to show that for any vertex, v, if v is active at step s then q_{s+1}(v), next_{s+1}(v),
\bar{q}_{s+1}(v) and \bar{next}_{s+1}(v) are active vertices at step s.

If next(v) is modified at step s + 1, then next_{s+1}(v) = \bar{next}_s(v) by definition. q(v) is
modified at step s + 1 in one of two cases: when v is executing a hooking step and when
p_s(v) becomes part of a cycle which does not include v (note that q(v) is not modified
by a promotion operation). In the former case, q_{s+1}(v) = \bar{q}_s(v). In the latter case, v is
not a follower and q_{s+1}(v) = tail_{s+1}(p_{s+1}(v)) (see Definition 7.3 (p. 23)). In this case,
p_{s+1}(v) = p_s(v). As p_s(v) is a q_s-ancestor of v (Lemma 7.13), p_s(v) is active at step s and it
follows that tail_{s+1}(p_s(v)) is active at step s as well.

The shadow variable \bar{q}(v) is modified only when v is executing a step hooking it to
some vertex, u. In this case, \bar{q}_{s+1}(v) = F_s(next_s(v)) which is a q_s-ancestor of next_s(v) (by
Theorem 7.2 (p. 25)).

The shadow variable \bar{next}(v) is modified in three cases: 1) when next(v) is being updated
to F(next(v)). Then the lemma follows from Theorem 7.2 (p. 25). 2) v just read a
suggestion from edge e. Then \bar{next}_{s+1}(v) = V_s(e), and the lemma follows. 3) v read the
vertex, w = F_s(next_s(v)), which it will assign to next(v). Then w is a q_s-ancestor of next_s(v)
by Theorem 7.2. □

Lemma 9.3 If a vertex, v, is simple at step t', t' < t, it is simple at step t.

Proof. The proof is by induction on the step number. If v has a larger ancestor at step
s then the lemma follows from Lemma 7.1. So suppose v is simple and does not have a
larger p-ancestor at step s. Note that a vertex can become a link destination at step s only
if it is a root at that step and therefore a leader. Also, by definition, a non-root vertex can
acquire a new child only if it is a link destination. As v is not a link destination (for then
it would be a potential leader for it has no larger p-ancestor), the only way v can acquire
new descendants is if a descendant, w, of v acquires new descendants. But then w is a link
destination at step s. As v is simple, there is a vertex, x, on the path from w to v which
is larger than v. x is on the path to v from any descendant acquired by w. Thus v is still
simple at step s + 1. □

Lemma 9.4 Let v be any vertex, and let t_1 and t_2 be any two steps, t_1 < t_2. If v and q_{t_1}(v)
are in the same chain of a chain partition at step t_1 and v is active at step t_2 then

1. v and q_{t_2}(v) are in the same chain of any natural chain partition at step t_2, and

2. If v is not head_{t_1}(q_{t_1}(v)) then q_{t_2}(v) = q_{t_1}(v).

Proof. The proof is by induction on the step number. Assume the lemma holds for step
t' ≥ t_1. We show it holds for step t' + 1. If v is a follower at step t' + 1, then v and q_{t'+1}(v)
are in the same followset and the lemma follows from Lemma 7.9. Otherwise, v is a head
of a followset. By the induction hypothesis, as v and q_{t'}(v) are in the same chain, the chain
containing q_{t'}(v) is not a link destination and v is the only active q_{t'}-child of q_{t'}(v).

The q-pointer of v may change in one of two ways: if v is a root and v hooks to another
vertex, u, or if q(v) is on a p-cycle which does not contain v. If v is a root at step t', then
q_{t'}(v) is the tail of the span containing v and the lemma follows from Lemma 7.5.

From Lemma 7.13, every p-cycle which contains q_{t'}(v) must contain v since v is the only
active q_{t'}-child of q_{t'}(v). In addition, the span containing q_{t'}(v) is not a link destination nor
is it a potential leader. Consequently, q_{t'}(v) cannot acquire any new q-children. It follows
that q_{t'+1}(v) = q_{t'}(v) and if v is active at step t' + 1, v is the only active q_{t'+1}-child of
q_{t'+1}(v). So q(v) is never on a p-cycle which does not contain v. □

Theorem 9.1 (Good Partition) Let t be any step number. For any natural chain partition,
U, of G^q_t and for any t' > t, the following two properties hold.

1. U is a chain partition of G^q_{t'}.

2. For any natural chain partition, U', of G^q_{t'}, and for every chain, C, of U, the active
part of C, at step t', is contained in a chain of U'.
Proof. The proof is by induction on the step number. Assume the theorem holds for t'' with
t ≤ t'' < t'. We show it holds for t' also.

In order to be a chain partition of G^q_{t'}, U must satisfy two conditions (see the definition
above). That the first condition holds can be seen as follows. For any vertex, l, if l is a
potential leader of G^q_{t'} it is also a potential leader of G^q_t (Lemma 9.3). The first condition
follows from Lemma 7.5.

That the second condition holds can be seen as follows. First, suppose that v ≠ head(v)
at step t' − 1. Then by Lemma 9.4, q_{t'}(v) = q_t(v) and v and q_{t'}(v) are in the same chain of
any natural chain partition at step t'; so they satisfy property 2 of the chain partition.

Second, suppose that v = head(v) at step t' − 1. If v is a potential leader, condition 2
holds trivially. So suppose v is not a potential leader. Then q(v) can change only if q(v)
is on a p-cycle which does not contain v. From Lemma 7.13, every p-cycle which contains
q_{t'-1}(v) must contain v, since v is the only active q_{t'-1}-child of q_{t'-1}(v). In addition, the
span containing q_{t'-1}(v) is not a link destination nor is it a potential leader. Consequently,
q_{t'-1}(v) cannot acquire any new q-children. It follows that q_{t'}(v) = q_{t'-1}(v) and if v is active
at step t', v is the only active q_{t'}-child of q_{t'}(v). It remains to show that if v and q_{t'}(v)
are in distinct spans at step t' then the span of q_{t'}(v) at step t' is not a link destination.
By Lemma 7.9, v and q_{t'-1}(v) = q_{t'}(v) were in distinct spans at step t' − 1, and so by the
inductive hypothesis, the span at step t' − 1 of q_{t'}(v) was not a link destination. In order
to become a link destination, a cycle including q_{t'}(v) would have to be formed at step t';
but then this cycle includes v as it is q_{t'}(v)'s only q-child; this puts v and q_{t'}(v) in the same
followset and hence the same span at step t', so the claim follows trivially. This completes
the proof of Part 1 of the theorem.

Part 2 follows from the fact that a chain partition, restricted to the active vertices, is a
refinement of any natural chain partition. □

10      The  Depth  Function 

In order to measure the progress of the algorithm we assign a depth to each vertex. It is
natural to consider the F-distance from the vertex to the leader of the component. However,
the underlying q-graph might change between the time a process decides on a new value for
F(v) and the time the update actually occurs. Therefore, we define the depth of a vertex
as a function of both F(v) and its shadow \bar{F}(v).

The analysis uses a second directed graph, called the F-graph, defined as follows. For
any vertex, v, there are two F-edges emanating from v: (v, F(v)) and (v, \bar{F}(v)). Consider
any chain partition, U, of G^q and let v be any active vertex. Define an F|U-path starting
at v to be a path in the F-graph starting at v and ending at the first vertex, u, for which
either

1. F(u) is outside the chain containing v, or

2. u is a potential leader, or

3. R(u) is a potential leader.

Let (F, U)-dag_t(v) denote the collection of F|U-paths starting at v at step t. We show
(Lemma 10.1) that (F, U)-dag_t(v) is a directed acyclic graph; therefore, each path in
(F, U)-dag_t(v) is finite. Let l(v) denote the number of vertices on the longest such path.
The F-depth of v relative to U at step t, (F, U)-depth_t(v), is defined to be:

\[
(F,U)\text{-depth}_t(v) =
\begin{cases}
l(v) + 1 & \text{if the chain containing } v \text{ is a potential leader,} \\
l(v) & \text{otherwise.}
\end{cases}
\]

The F-depth of a chain, C, relative to U, (F, U)-depth_t(C), is the maximum of
(F, U)-depth_t(v) over all active vertices, v ∈ C. For any component, K, of the input graph, the
F-depth of K relative to U at step t, (F, U)-depth_t(K), is the sum over all chains, C ∈ U ∩ K,
of the F-depth of C. The F-depth of G relative to U at step t, (F, U)-depth_t(G), is the sum
over all chains, C ∈ U, of the F-depth of C.
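
The depth definitions can be summarised in a short sketch. The encoding is hypothetical: chains are lists of vertices, f and f_bar are dictionaries for F and its shadow, the stopping rule for F|U-paths is supplied by the caller as a predicate stops, and a chain with no active vertex is assumed to have depth 0. The recursion terminates only because the paths form a dag (Lemma 10.1).

```python
from functools import lru_cache


def f_depths(chains, active, f, f_bar, stops, leader_chains):
    """Return (list of per-chain F-depths, F-depth of the whole graph).

    stops(u)      : True if an F|U-path must end at u (caller-supplied rule)
    leader_chains : indices of the chains that are potential leaders
    """
    @lru_cache(maxsize=None)
    def longest(u):
        # l(u): number of vertices on the longest F|U-path starting at u.
        if stops(u):
            return 1
        return 1 + max(longest(f[u]), longest(f_bar[u]))

    per_chain = []
    for i, chain in enumerate(chains):
        bonus = 1 if i in leader_chains else 0           # l(v) + 1 for leader chains
        depths = [longest(v) + bonus for v in chain if v in active]
        per_chain.append(max(depths, default=0))         # chain depth: max over active vertices
    return per_chain, sum(per_chain)                     # graph depth: sum over chains


# Usage: chain [1, 2, 3] feeds into the potential leader chain [4].
chains = [[1, 2, 3], [4]]
chain_of = {v: i for i, c in enumerate(chains) for v in c}
f = {1: 2, 2: 3, 3: 4, 4: 4}
f_bar = {1: 3, 2: 3, 3: 4, 4: 4}
stops = lambda u: u == 4 or chain_of[f[u]] != chain_of[u]
print(f_depths(chains, {1, 2, 3, 4}, f, f_bar, stops, {1}))
```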

Lemma 10.1 For any vertex, v, (F, U)-dag(v) is a directed acyclic graph.

Proof. Every q-cycle comprises a single followset (Lemma 7.16). Also, for any vertex, v,
F(v) is a q-ancestor of v (Theorem 7.2 (p. 25)). Consider any chain, C. There are two cases:
either C is contained in a q-cycle or none of its vertices are on a q-cycle. In the latter case
the lemma follows trivially. In the former case, the lemma follows from the observation that
if head(v) is on the q-path from v to F(v) then R(v) = head(v) (Theorem 7.2 (p. 25) and
Theorem 7.1 (p. 21)). □

Definition 10.1 For any chain partition, U, a chain, C, of U is called U-minimal at step
t if either it is a potential leader of G^q_t and (F, U)-depth_t(C) ≤ 2, or (F, U)-depth_t(C) = 1.

For any step t, let U_t denote a natural chain partition of G^q_t.

When  using  a  round  number  as  a  subscript  instead  of  a  step  number,  it  refers  to  the  last 
step in the given round. So, for any round, r, G^q_r refers to the q-graph at the last step of
round  r. 
Lemma 10.2 For any round, r, (F, U_{r+1})-depth_{r+1}(G) ≤ (F, U_r)-depth_{r+1}(G).

Proof. It follows from Part 1 of Theorem 9.1 (p. 30) that U_r is a chain partition of G^q_{r+1}.
The lemma follows from Part 2 of Theorem 9.1 by the observation that the F-depth of a
chain formed by combining two consecutive chains into one is not larger than the sum of
the depths of the two chains. □

If we can show that for some constant, α, α < 1, and for any chain, C, of U_r,

\[ (F,U_r)\text{-depth}_{r+1}(C) \le \alpha\,(F,U_r)\text{-depth}_r(C), \tag{1} \]

then we can conclude that

\[ (F,U_r)\text{-depth}_{r+1}(G) \le \alpha\,(F,U_r)\text{-depth}_r(G). \]

It would then follow from Lemma 10.2 that, in this case,

\[ (F,U_{r+1})\text{-depth}_{r+1}(G) \le \alpha\,(F,U_r)\text{-depth}_r(G). \]

However, life is not that simple. In Theorem 10.1 we show that for any chain, C, if C is not
U_r-minimal then Equation 1 holds. But some more work has to be done for U_r-minimal
chains.

Theorem 10.1 Let r be any round, let U_r be any natural partition of G^q_r and let C be any
chain of U_r which is not U_r-minimal at round r. Then

\[ (F,U_r)\text{-depth}_{r+1}(C) \le \alpha\,(F,U_r)\text{-depth}_r(C), \]

where

\[
\alpha =
\begin{cases}
3/4 & \text{if } C \text{ is a potential leader chain at round } r, \\
2/3 & \text{otherwise.}
\end{cases}
\]

Proof. Let v be any vertex. Assume that the chain, C_v, containing v is not a potential
leader. Consider (F, U_r)-dag_r(v). First we note that the hooking and promotion operations
have no effect on this dag; the claim is clear for a hooking operation. For a promotion
operation note that a vertex is promoted only if it is a leader, and, therefore, the head of a
chain. Now we consider the effect of the doubling operation.

During round r + 1 each vertex on this dag executes a doubling operation; hence, for
every F-edge after round r + 1 there was an F-path of length at least 2 after round r. Recall
that the F-depth of a path is one larger than the number of edges on the path; so l(v) − 1
was reduced to at most ⌊(l(v) − 1)/2⌋; thus, l(v) was reduced to at most ⌈l(v)/2⌉.

Now, suppose C_v is a potential leader chain. Recall that C_v comprises one span and,
therefore, all the vertices in C_v are followers of l, the potential leader vertex. (Note that
l may be a leader, or even a root.) Of the vertices of C_v, l is the only vertex that can be
promoted. The remaining vertices execute a doubling operation during round r + 1. An
argument similar to the one given for simple chains shows that during the round the depth
of a vertex, u, was reduced from l(u) + 1 to ⌈l(u)/2⌉ + 1. □
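
The constants 2/3 and 3/4 of Theorem 10.1 can be checked numerically against the bounds derived in the proof. The sketch below is only a sanity check under the stated reading of the proof: a non-minimal chain has l ≥ 2, its longest path shrinks from l vertices to at most ⌊(l − 1)/2⌋ + 1, and a potential leader chain carries one extra unit of depth. The function names are hypothetical.

```python
def after_doubling(l):
    """Bound on l(v) after one round: floor((l - 1) / 2) edges plus one vertex."""
    return (l - 1) // 2 + 1


def bounds_hold(max_l):
    """Check both depth bounds for every path size l with 2 <= l <= max_l."""
    for l in range(2, max_l + 1):
        new_l = after_doubling(l)
        if new_l != (l + 1) // 2:                # equals ceil(l / 2)
            return False
        if 3 * new_l > 2 * l:                    # simple chain: new depth <= (2/3) l
            return False
        if 4 * (new_l + 1) > 3 * (l + 1):        # leader chain: new depth <= (3/4)(l + 1)
            return False
    return True


print(bounds_hold(10_000))
```

Both bounds are tight at l = 3, where the depth drops from 3 to 2 for a simple chain and from 4 to 3 for a potential leader chain.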

11      The  Potential  Function 

Finally,  using  the  depth  function,  F-depth,  defined  earlier,  we  define  a  weight  function  over 
the  components  of  the  input  graph.  We  then  show  that  at  each  round  each  component  of 
the  graph  loses  a  constant  fraction  of  its  weight.  For  each  component,  its  first  stage  ends  as 
soon  as  its  weight  is  less  than  2. 

The analysis proceeds as follows. During each round each chain is tagged. We show that
the sum of the tags of all the chains at the end of the round is bounded by the weight lost
by the component during the round. The proof is completed by showing that, by the end of
the round, the sum of the tags of all the chains is at least a constant fraction of the weight
of the graph at the beginning of the round.

The weight function is defined relative to a chain partition, U, as follows. We choose two
constants, c_s and c_d. We assign to each vertex, v, auxiliary weight, ex(v), comprised of a
spare, sp(v), of at most c_s units, and a debt, db(v), of at most c_d units; both are nonnegative.
For each vertex, v, ex(v) = sp(v) − db(v). For each set of vertices, S, the spare weight of
S, sp(S), is equal to the sum of the spare weights of all the vertices in S. Define db(S) and
ex(S) similarly.

Let $C$ be any component of the input graph. The weight of $C$ relative to a partition,
$U$, at step $t$, $(F,U)\text{-weight}_t(C)$, is 0 if all the vertices of $C$ are inactive; otherwise,
$(F,U)\text{-weight}_t(C) = (F,U)\text{-depth}_t(C) + ex(C)$. Recall that for any two natural
partitions, $U$ and $V$, of $G$ at step $t$, $(F,U)\text{-depth}_t(C) = (F,V)\text{-depth}_t(C)$. Therefore,
$(F,U)\text{-weight}_t(C) = (F,V)\text{-weight}_t(C)$. Define $\text{weight}_t(C) = (F,U_t)\text{-weight}_t(C)$ for any
natural chain partition, $U_t$, of $G$ at step $t$.

We constrain the spare weight of each active component of $G_q$ to be between $c_s$ and $\frac{3}{2}c_s$.
Therefore, each time a hooking operation merges two components of $G_q$, the spare weight of
the graph is reduced by at least $\frac{1}{2}c_s$.

The analysis looks at each chain, $C$, of the chain partition. If all the vertices of $C$ become
inactive at round $t$, $C$ loses all its weight. If, at the start of the round, the depth $f_C$ of $C$
exceeds 2, $C$ reduces its weight by at least a constant fraction of $f_C$ due to the doubling
steps performed on all its internal vertices (Theorem 10.1). If $f_C \le 2$, $C$ is not so fortunate.
To account for its weight we use one of three methods: (1) $C$ receives part of a tag associated
with some other chain, $D$; (2) $C$ uses the reduction in the spare weight (due to a hooking
operation); (3) $C$ increases its debt.

11.1     The  Spare  Weight 

We  start  by  defining  the  auxiliary  weight  and  showing  that  each  hooking  reduces  the  spare 
weight  of  the  graph. 

Definition 11.1 A vertex, $v$, is called a cycle destination of $r$ if $v$ is a link destination of
$r$ and $r$ is the root of the component containing $v$.

For any vertex, $v$, let $sp_t(v)$ denote the spare weight of $v$ at step $t$ and let $db_t(v)$ denote
the debt of $v$ at step $t$. The auxiliary weight is defined as follows:

1. For each vertex, $v$, $sp_0(v) = c_s$ and $db_0(v) = 0$.

2. The auxiliary weight is modified at the writing step of a hooking operation. Consider
a root, $r$, and assume $r$ hooks to some vertex, $u$, at step $t$. Then $db_t(r) = db_t(u) = 0$.
In addition,

(a) If a $p_t$-descendant, $w$, of $r$ is a link destination of a $p_t$-ancestor of $u$ then $sp_t(u)$
is set to 0 and $sp_t(w)$ is set to $\frac{1}{2}c_s$.

(b) If $u$ is a $p_t$-descendant of $r$, the hooking forms a $p$-cycle; let $l$ be the leader of the
cycle. $sp_t(l)$ is set to $c_s$ and $sp_t(u)$ is set to 0. If $l \ne r$ then $sp_t(r)$ is set to 0 as
well.

(c) Otherwise, $sp_t(r)$ is set to 0.

The debt is defined shortly. For the purposes of this section we state the following
assumption, which we later prove (Lemma 11.6 and Lemma 11.7).

Assumption. For any vertex, $v$, if $v$ is not a root then $db(v) = 0$. If $v$ is a root then
$0 \le db(v) \le c_d$.

Lemma 11.1 For any vertex, $v$, if $v$ is a leader then $sp(v) = c_s$.

Proof.  The  proof  is  by  induction  on  the  step  number.  Initially,  the  lemma  holds  by 
definition.  A  root  loses  its  spare  weight  only  when  it  ceases  being  a  root.  When  a  hook 
creates a cycle, the leader of the new cycle has spare weight $c_s$ by definition. □
Lemma 11.2 For every vertex, $v$, if $v$ is a cycle destination then $sp(v) = \frac{1}{2}c_s$.

Proof. Let $v$ and $r$ be two vertices and assume that, at step $t$, $v$ is a cycle destination
of $r$. As $r$ is a root, it is not a follower (Lemma 7.2); therefore, $r$ is a $p_t$-ancestor of $v$
(Lemma 7.12).

Let $t_1$ be the step in which $r$ started the hooking operation it is performing at step $t$.
Then, at some step $t_2$, $t_1 < t_2 < t$, $A(r)$ became equal to $v \ne r$. As $v$ is a root at step $t_1$, $r$ is
not a $q$-ancestor of $v$ at step $t_2$. Therefore, $r$ is not a $p$-ancestor of $v$ at step $t_2$ (Lemma 7.13).
Consequently, at some step, $t'$, $t_2 < t' < t$, a hooking step hooked a $p$-ancestor, $x$, of $v$ to
a $p$-descendant, $y$, of $r$. It follows from rule (2-a) for spare weight that $sp_{t'}(v) = \frac{1}{2}c_s$ ($v$ is
the vertex $w$ of rule (2-a)). Between steps $t'$ and $t$, $r$ is a $p$-ancestor of $v$ and $r$ is a root.
Therefore, $sp(v)$ does not change between steps $t'$ and $t$. □


Theorem 11.1 For any vertex, $v$,

$$sp(v) = \begin{cases} c_s & \text{if } v \text{ is a leader} \\ \frac{1}{2}c_s & \text{if } v \text{ is a cycle destination} \\ 0 & \text{otherwise.} \end{cases}$$

Proof.  The  first  two  cases  follow  from  Lemma  11.1  and  Lemma  11.2,  respectively. 

We have to show that if a vertex is neither a leader nor a cycle destination its spare
weight is 0. We show this by induction on the step number. Consider such a vertex, $v$,
following some step $t$. It follows from the definition that $v$ did not receive spare weight at
step $t$. If $sp_{t-1}(v) \ne 0$ then, by the induction hypothesis, following step $t - 1$, $v$ is either a
root or a cycle destination. In either case, as $v$ is not a leader or a cycle destination at step
$t$, either $v$ creates a hook or is hooked to at step $t$. It follows that $sp_t(v) = 0$. □
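Theorem 11.1 makes the spare weight a function of a vertex's role alone, which a few lines of code can mirror. The sketch below assumes $c_s = 16/5$ purely for illustration (any positive value gives the same picture); the names `spare` and `component_spare` are hypothetical helpers, not part of the paper.

```python
from fractions import Fraction

C_S = Fraction(16, 5)  # assumed illustrative value of c_s


def spare(is_leader: bool, is_cycle_destination: bool) -> Fraction:
    """Spare weight of a vertex as a function of its role (Theorem 11.1)."""
    if is_leader:
        return C_S
    if is_cycle_destination:
        return C_S / 2
    return Fraction(0)


def component_spare(roles) -> Fraction:
    """Total spare weight of a component given (leader, cycle_dest) flags."""
    return sum((spare(leader, dest) for leader, dest in roles), Fraction(0))


if __name__ == "__main__":
    # A rooted component: one leader, one cycle destination, two other vertices.
    rooted = [(True, False), (False, True), (False, False), (False, False)]
    # A component without a cycle destination: only the leader carries spare weight.
    plain = [(True, False), (False, False)]
    print(component_spare(rooted), component_spare(plain))
```

With one leader and at most one cycle destination per component, the total stays between $c_s$ and $\frac{3}{2}c_s$.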

Lemma 11.3 For any $q$-component, $C$, if $C$ is rooted, $c_s \le sp(C) \le \frac{3}{2}c_s$. If $C$ is not rooted,
$sp(C) = c_s$.

Proof. A component, $C$, has one leader. A vertex has at most one link destination;
therefore, $C$ has at most one cycle destination (it has a cycle destination only if it is rooted,
the root has a link destination, and that link destination is in $C$). The lemma follows from
Theorem 11.1 (p. 36). □

Lemma 11.4 The spare weight of the graph never increases and each hooking operation
reduces the spare weight of the graph by at least $\frac{1}{2}c_s$ units.
Proof. The hooking operation is the only operation that affects the spare weight. Consider
a hooking operation hooking some root, $r$, to a vertex, $v$, at some step, $t$. If $v$ and $r$ are in
the same $q_{t-1}$-component, $C$, then $sp_{t-1}(C) = \frac{3}{2}c_s$ (Theorem 11.1 (p. 36)). After the hooking
operation $C$ is not rooted. It follows that $sp_t(C) = c_s$ (Lemma 11.3).

If $v$ and $r$ are not in the same $q_{t-1}$-component, the sum of the spare weights of the
two components is at least $2c_s$ (Lemma 11.3). After the hooking $v$ and $r$ are in the same
component; that component has spare weight at most $\frac{3}{2}c_s$. □
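The two cases of the proof can be tabulated directly. The sketch below takes the bounds of Lemma 11.3 as given and computes the smallest possible drop in spare weight in each case; the function names are illustrative, and $c_s$ is left as a parameter.

```python
from fractions import Fraction


def spare_drop_same_component(c_s: Fraction) -> Fraction:
    """Root hooks inside its own component: 3/2 c_s before, c_s after."""
    before = Fraction(3, 2) * c_s   # rooted, with a cycle destination
    after = c_s                     # the component is no longer rooted
    return before - after


def min_spare_drop_two_components(c_s: Fraction) -> Fraction:
    """Root hooks into another component: the smallest possible drop."""
    before_min = 2 * c_s                # each component holds at least c_s
    after_max = Fraction(3, 2) * c_s    # the merged component holds at most 3/2 c_s
    return before_min - after_max


if __name__ == "__main__":
    c_s = Fraction(16, 5)  # assumed illustrative value
    print(spare_drop_same_component(c_s), min_spare_drop_two_components(c_s))
```

Both cases give a drop of at least $\frac{1}{2}c_s$, matching the statement of Lemma 11.4.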

11.2     Assignment  of  Short  Chains 

Let $t$ be any round and let $U_t$ be a natural chain partition of $G$ at the beginning of round
$t$. We examine the chains of $U_t$.

Chains which are either link destinations or have no children chains are called terminal
chains. Recall that a chain is minimal either if it has depth 1 or if it is a potential leader with
depth 2. Minimal chains are associated with other chains according to the following rules.
When a chain, $C$, is associated with a chain, $D$, that was associated with another chain,
$D'$, by a prior rule, it is understood that $C$ is associated with $D'$. For each chain we use the
first rule that applies, and only that rule.

Rule  1:  A  link  destination  chain,  C,  is  not  associated  with  any  chain. 

Rule 2: Each (minimal) potential leader chain which is not a leader chain is associated with
one of the chains in its promoting set. Chains associated by this rule have depth 2.

Rule 3: If a chain, $C$, has exactly one child, $D$, either $C$ is a link destination, $C$ is a
potential leader or $D$ is a potential leader. If $C$ is a link destination apply Rule 1 and
if $C$ is a potential leader, apply Rule 2. If $D$ is a potential leader and $C$ is minimal,
$C$ is associated with $D$. Chains associated by this rule have depth 1.

Rule 4: Each minimal depth 1 chain, $C$, which has 2 children chains is associated with a
distinct ancestor terminal. Each terminal can receive at most one chain due to this
rule. (Minimal depth 2 chains are potential leaders; these are taken care of either by
Rule 2 or by Rule 5.)

Rule  5:  A  minimal  leader  chain  which  has  a  child  chain  is  associated  with  that  child.  Such 
a  chain  has  depth  2. 

Note that each chain can have at most 4 chains associated with it: one potential leader
(Rule 2), one parent of a potential leader (Rule 3), the component leader (Rule 5) and one
chain associated with it by Rule 4. The sum of the depths of all the chains associated with
a given chain is at most 6.
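The bound of 6 is simply the sum of the depths that the rules allow. The sketch below records, for each rule that can attach a chain, the depth stated in that rule; the dictionary name and keys are illustrative only.

```python
# Depth of the single chain that each rule can associate with a given chain,
# as stated in Rules 2 to 5.
ASSOCIATED_DEPTH_BY_RULE = {
    "Rule 2 (potential leader)": 2,
    "Rule 3 (parent of a potential leader)": 1,
    "Rule 4 (depth 1 chain with two children)": 1,
    "Rule 5 (component leader)": 2,
}

MAX_ASSOCIATED_CHAINS = len(ASSOCIATED_DEPTH_BY_RULE)
MAX_ASSOCIATED_DEPTH = sum(ASSOCIATED_DEPTH_BY_RULE.values())

if __name__ == "__main__":
    print(MAX_ASSOCIATED_CHAINS, MAX_ASSOCIATED_DEPTH)
```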

The only minimal chains not covered by any of the five rules above are leader chains
which do not have any children chains. We call such leader chains root accrual chains. We
show below, in Lemma 11.6, that only the roots of root accrual chains can have debts.

11.3     Weight  Reduction 

Consider round $t + 1$. For any chain $C$ let $f_C$ denote $(F,U_t)\text{-depth}_t(C)$, the depth of $C$
relative to $U_t$ at the beginning of the round. We assign a tag, $tag_{t+1}(C)$, to $C$ and show
that during the round $tag_{t+1}(C) \ge \frac{1}{10} f_C$. For each component, $K$, of the input graph, let
$tag_{t+1}(K)$ denote the sum of $tag_{t+1}(C)$ over all chains, $C$, contained in $K$. We
show that the sum of the tags assigned to chains of $K$ is bounded by the weight lost by
$K$ during the round; that is, $tag_{t+1}(K) \le (F,U_t)\text{-weight}_t(K) - (F,U_t)\text{-weight}_{t+1}(K)$. Note
that the auxiliary weight of the graph is independent of the chain partition used.

Consider a chain, $C$. $C$ may have several chains associated with it. When we assign
a tag to $C$ it is understood that $C$ shares this tag with all the chains associated with it,
each chain receiving a share of the tag proportional to its depth at the start of the round.
Let $W_C$ be the sum of the depths of all the chains associated with $C$, if any (recall that
$W_C \le 6$).

Consider a chain, $T$, which is not associated with any other chain. The following
paragraphs describe the assignment of tags to chains. Each case preserves the property that the
tag assigned is accounted for by a reduction in the weight of the component (this can be
in one of two forms: a reduction in the depth of the chain or a reduction in the auxiliary
weight of the component). Set $c_s = \frac{16}{5}$ and $c_d = \frac{3}{16}c_s = \frac{3}{5}$. There are two main cases:

$T$ is not minimal. Then, by Theorem 10.1, during the round the depth of $T$ is reduced
by at least $\lfloor \frac{1}{2} f_T \rfloor$. Set $tag_{t+1}(T) = \lfloor \frac{1}{2} f_T \rfloor$. After distributing the tag, the tag associated
with $T$ is at least $\lfloor \frac{1}{2} f_T \rfloor \cdot f_T/(f_T + W_T)$, which is at least $\frac{1}{10} f_T$. Therefore, each chain,
$C$, associated with $T$ receives a tag equal to at least $\frac{1}{10} f_C$.

T  is  minimal.  There  are  a  number  of  cases: 

• $T$ is a link destination of some root, $r$. Then during the round $r$ hooks to $T$; let
$C$ be the chain containing $r$. By Lemma 11.4, the hooking operation reduces the
spare weight of the graph by at least $\frac{1}{2}c_s$. The debt, if any, of $C$ and $T$ is set to
0.
If $T$ (resp. $C$) is a root accrual chain, set its tag to $\frac{1}{16}c_s = \frac{1}{5}$; otherwise, set its
tag to $\frac{1}{4}c_s = \frac{4}{5}$. A root accrual chain does not have any chains associated with
it; therefore, $T$, $C$, and all the chains associated with them receive a tag of at
least $\frac{1}{10}$ their original depth.

Let $i$ be the number of root accrual chains among $T$ and $C$ (i.e. $i \in \{0, 1, 2\}$).
Then the auxiliary weight is decreased by at least $\Delta = \frac{1}{2}c_s - i \cdot c_d = c_s(\frac{1}{2} - \frac{3}{16}i)$.
The total of the tags assigned by this case is $c_s(\frac{1}{8} + \frac{3}{16}(2 - i)) = \Delta$.

• $T$ is a root and $T$ hooks during the round. Then it is assigned a tag of at least
$\frac{1}{10} f_T$ by the preceding paragraph.

• $T$ is not a root accrual chain. As $T$ is minimal and it was not associated with
any chain, $T$ is terminal. $T$ is not a link destination (otherwise it is covered by
the first case). It follows that $T$ is not a potential leader; hence, $f_T = 1$. During
the round all the edges leave $T$ and no new edges enter $T$. Therefore, all the
vertices of $T$ become inactive. Set $tag_{t+1}(T) = 1$. Thus, $T$ and each of the chains
associated with it receive a tag equal to at least $\frac{1}{7}$ of their original depth.

• $T$ is a root accrual chain. Its depth is 2. Let $r$ denote its root. If $T$ did
not receive a tag by any previous rule, increase the debt of $r$ by $\frac{1}{3}c_d$ and set
$tag_{t+1}(T) = \frac{1}{3}c_d$ $(= \frac{1}{5})$. The tag assigned to $T$ is at least $\frac{1}{10}$ its original depth.
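The bookkeeping in the cases above can be verified with exact rational arithmetic. The sketch below assumes $c_s = 16/5$, $c_d = \frac{3}{16}c_s = 3/5$ and $W_T \le 6$; these are the values under which the accounting balances, and they should be treated as assumptions of the sketch. The function names are illustrative.

```python
from fractions import Fraction

C_S = Fraction(16, 5)            # assumed spare constant c_s
C_D = Fraction(3, 16) * C_S      # assumed debt constant c_d (= 3/5)
W_MAX = 6                        # largest total depth of associated chains


def hook_case(i: int):
    """Hooking case with i root accrual chains among T and C (i in 0..2).

    Returns (guaranteed drop in auxiliary weight, total of the tags assigned).
    """
    aux_drop = C_S / 2 - i * C_D
    tags = i * (C_S / 16) + (2 - i) * (C_S / 4)
    return aux_drop, tags


def nonminimal_share(f_t: int, w: int = W_MAX) -> Fraction:
    """Part of the tag floor(f_T / 2) that T keeps after sharing it."""
    return Fraction((f_t // 2) * f_t, f_t + w)


if __name__ == "__main__":
    for i in range(3):
        print(i, *hook_case(i))
    print([nonminimal_share(f) for f in range(2, 7)])
```

In each hooking case the tags exactly use up the guaranteed drop in auxiliary weight, and a non-minimal chain keeps at least one tenth of its depth, with the tightest ratio at $f_T = 3$.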

This, together with Lemma 11.4, which states that the spare weight of the graph never
increases, shows:

Lemma 11.5 For each round number, $t$, for any natural chain partition, $U_t$, for each chain,
$C$, of $U_t$, and for each component, $K$, of the input graph,

$$tag_{t+1}(C) \ge \tfrac{1}{10}\,(F,U_t)\text{-depth}_t(C) \qquad (2)$$

and

$$tag_{t+1}(K) \le (F,U_t)\text{-weight}_t(K) - (F,U_t)\text{-weight}_{t+1}(K). \qquad (3)$$

11.4     The  Debt 

Consider a root accrual chain. For every component, $C$, of the $q$-graph, which is not a
component of the input graph, there is an edge with one endpoint in $C$ and one outside.
Eventually, one of these edges gives the root a hooking suggestion. As soon as the root acts
on this suggestion and updates its next pointer the root has a current destination outside
its component. If the root does not hook and no root hooks to it, all the edges leave the
component and it becomes inactive. However, this process may take up to 4 rounds. That
is, each root accrual chain may avoid making any progress for three rounds. We account for
these chains by artificially reducing the weight of the graph by increasing the debt of the
root of the chain.

Lemma  11.6  An  active  vertex,  v,  has  nonzero  debt  only  if  it  is  the  root  of  a  root  accrual 
chain. 

Proof.  By  definition,  a  vertex  increases  its  debt  only  if  it  is  the  root  of  a  root  accrual  chain. 
Consider  a  root  chain,  C,  with  root  r.  If  C  starts  a  round,  t,  as  a  root  accrual  chain,  the 
only  way  it  can  stop  being  a  root  accrual  chain  is  if  it  either  becomes  inactive,  or,  for  some 
vertex $u$, either $r$ hooks to $u$ or $u$ hooks to $r$. In the latter two cases the debt of $r$ (and of
$u$) is set to 0 (see Rule 2 in the section describing the spare weight, Section 11.1). □

Definition 11.2 For any chain, $C$, and any round, $t$, let $acc_t(C)$ denote that $C$ is a root
accrual chain at round $t$.

Lemma 11.7 For any vertex, $v$, $db(v) \le c_d$.

Proof. Consider any root accrual chain, $C$, with root $r$, and let $t$ be the round at which $C$
became a root accrual chain. That is, $acc_t(C) \wedge \neg acc_{t-1}(C)$. Then by Lemma 11.6, $db_t(r) = 0$.
Consider what may happen to a root accrual chain which does not hook and is not hooked
to.

1. If $f_C = 2$, then, by the end of the round every edge with an endpoint in $C$ will have its
endpoint at $r$ and will "know" that $r$ is a root.

2. If at the beginning of a round all the endpoints are at $r$ and they all know that $r$ is
a root, then by the end of the round every suggestion written to $r$ (until $r$ hooks or is
hooked to) is outside its component.

3. In at most one additional round, $r$ has a current destination outside its component.

4. If, at the beginning of a round, $r$ has a current destination outside of $C$, then during
the round all the edges migrate out of $C$ and $C$ becomes inactive.

It follows that there are at most three rounds in which the debt of $r$ increases. When the
debt increases, it increases by $\frac{1}{3}c_d$ (see the final case in the section describing the reduction in
weight, Section 11.3). Any sequence of debt increases of $C$ starts with $db(C) = 0$; therefore,
$db(C) \le c_d$. □
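The debt bound amounts to at most three increments of $\frac{1}{3}c_d$ starting from zero. The sketch below assumes $c_d = 3/5$ for illustration; `debt_history` is a hypothetical helper that records the debt of the root after each stalled round.

```python
from fractions import Fraction

C_D = Fraction(3, 5)        # assumed debt cap c_d
MAX_STALLED_ROUNDS = 3      # rounds in which the debt can still grow


def debt_history(stalled_rounds: int):
    """Debt of the root after each stalled round, starting from debt 0."""
    debt = Fraction(0)
    history = [debt]
    for _ in range(min(stalled_rounds, MAX_STALLED_ROUNDS)):
        debt += C_D / 3     # each stalled round adds one third of c_d
        history.append(debt)
    return history


if __name__ == "__main__":
    print(debt_history(5))
```

Each increment equals the tag of $\frac{1}{5}$ given to a depth 2 root accrual chain, so the debt pays for exactly the rounds in which the chain makes no progress.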
Theorem 11.2 For each round, $t$, and for each component, $K$, of the input graph, if $K$ is
in Stage 1 at round $t + 1$ then $\text{weight}_{t+1}(K) \le \frac{33}{34}\,\text{weight}_t(K)$.

Proof. Recall that the weight of a $q$-component has two parts: its depth and its auxiliary
weight. Let $U_t$ be any natural chain partition of $G$ at the beginning of round $t + 1$, let
$U_{t+1}$ be any natural chain partition of $G$ at the end of round $t + 1$, and let $K$ be any
component of the input graph. In Lemma 11.5 we have shown that for each chain, $C$, of $U_t$,
$tag_{t+1}(C) \ge \frac{1}{10}(F,U_t)\text{-depth}_t(C)$. Therefore,

$$tag_{t+1}(K) \ge \tfrac{1}{10}\,(F,U_t)\text{-depth}_t(K).$$

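We have also shown that

$$tag_{t+1}(K) \le (F,U_t)\text{-weight}_t(K) - (F,U_t)\text{-weight}_{t+1}(K).$$

Therefore,

$$(F,U_t)\text{-weight}_t(K) - (F,U_t)\text{-weight}_{t+1}(K) \ge \tfrac{1}{10}\,(F,U_t)\text{-depth}_t(K).$$

By Lemma 11.3, the auxiliary weight of each active component of the $q$-graph is at most
$\frac{3}{2}c_s = \frac{24}{5}$. Note that the depth of each such component is at least 2, so its weight is at most
$1 + \frac{12}{5} = \frac{17}{5}$ times its depth. Therefore,

$$(F,U_t)\text{-depth}_t(K) \ge \tfrac{5}{17}\,(F,U_t)\text{-weight}_t(K).$$

Consequently,

$$(F,U_t)\text{-weight}_t(K) - (F,U_t)\text{-weight}_{t+1}(K) \ge \tfrac{1}{34}\,(F,U_t)\text{-weight}_t(K),$$

and

$$(F,U_t)\text{-weight}_{t+1}(K) \le \tfrac{33}{34}\,(F,U_t)\text{-weight}_t(K).$$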

The auxiliary weight is independent of the chain partition used. It follows from Lemma 10.2
that

$$\text{weight}_{t+1}(K) = (F,U_{t+1})\text{-weight}_{t+1}(K) \le (F,U_t)\text{-weight}_{t+1}(K).$$

Therefore,

$$\text{weight}_{t+1}(K) \le \tfrac{33}{34}\,\text{weight}_t(K).$$
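The constants in this proof follow from three inputs: the tag fraction $\frac{1}{10}$, the bound $\frac{3}{2}c_s$ on the auxiliary weight, and the minimum depth 2 of an active component. The sketch below recomputes them exactly, assuming $c_s = 16/5$; the variable names are illustrative.

```python
from fractions import Fraction

C_S = Fraction(16, 5)              # assumed spare constant c_s
TAG_FRACTION = Fraction(1, 10)     # tag >= depth / 10 (Lemma 11.5)
MIN_DEPTH = Fraction(2)            # depth of an active component is >= 2

MAX_AUX = Fraction(3, 2) * C_S                      # auxiliary weight <= 24/5
# weight = depth + aux <= depth * (1 + MAX_AUX / MIN_DEPTH)
DEPTH_OVER_WEIGHT = MIN_DEPTH / (MIN_DEPTH + MAX_AUX)   # 5/17
LOSS_PER_ROUND = TAG_FRACTION * DEPTH_OVER_WEIGHT       # 1/34
DECAY = 1 - LOSS_PER_ROUND                              # 33/34

if __name__ == "__main__":
    print(MAX_AUX, DEPTH_OVER_WEIGHT, LOSS_PER_ROUND, DECAY)
```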


Theorem  11.3   Stage  1  of  the  algorithm  terminates  in  O(logn)  rounds. 
Proof. Theorem 11.2 shows that as long as a component of the input graph is in Stage 1 of
the algorithm its weight is reduced in each round by at least a constant multiplicative factor,
$k = 1 - \frac{1}{34}$. Consider any component, $K$, of the input graph. The weight of $K$ is nonnegative.
Furthermore, as long as any of the vertices of $K$ are active, the depth of $K$ is at least 2.
Initially, the weight of the graph is $(2 + c_s)n \le 6n$. Therefore, after $O(\log n)$ rounds all the
edges must have terminated and the algorithm is in Stage 2. □
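To see the logarithmic bound concretely, one can count how many rounds of decay by $\frac{33}{34}$ it takes for an initial weight of $6n$ to fall below 2, the threshold at which Stage 1 ends. The sketch below uses exact rationals; the decay factor and the constants 6 and 2 are taken from the argument above, and the function name is illustrative.

```python
from fractions import Fraction

DECAY = Fraction(33, 34)   # per-round weight factor, assumed from Theorem 11.2
INITIAL_PER_VERTEX = 6     # initial weight is at most 6n
THRESHOLD = 2              # Stage 1 ends once the weight is below 2


def stage1_round_bound(n: int) -> int:
    """Smallest t such that 6n * (33/34)^t < 2."""
    weight = Fraction(INITIAL_PER_VERTEX * n)
    rounds = 0
    while weight >= THRESHOLD:
        weight *= DECAY
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, stage1_round_bound(n))
```

The returned count grows by a fixed amount each time $n$ is multiplied by a constant, which is the $O(\log n)$ behaviour claimed in Theorem 11.3; the hidden constant is about $1/\ln(34/33)$.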

12     Stage  II 

The second stage of the algorithm begins after all the edges terminate and each component,
$C$, of the input graph comprises exactly one $q^+$-component. Let $T_1$ denote the step at which
the last edge of $C$ terminates. During the second stage the current destination edges of $G_{q^+}$
are replaced by edges of $G_q$ until the vertices of $C$ comprise one $q$-component. Additional
applications of the doubling procedure are performed until all the vertices of $C$ have a
common A pointer.

Each component of the $q^+$-graph has one cycle, possibly a self-loop. We call a vertex a
core vertex if it is on a cycle of $G_{q^+}$; it is called a peripheral vertex, otherwise. (Note that
whether a vertex is a core vertex or a peripheral vertex is determined by the $q^+$-graph at
step $T_1$ and does not subsequently change.)

We introduce a function, $H$, defined as follows: $H(v)$ is equal to $next(v)$ if $v$ is a root;
$H(v) = F(v)$ otherwise. Also, let $\bar{H}(v)$ denote the shadow of the corresponding value. Let
$C$ be any component of the input graph, and let $l$ be the final leader of the component.
Define an auxiliary directed graph, the $H$-graph, over the vertices of $G$ as follows. For each
vertex $v$, there are two directed edges emanating from $v$: $(v, H(v))$ and $(v, \bar{H}(v))$.

We assign a weight to each vertex, $v$; for any step, $t$, $wt_t(v)$, the weight of $v$ after step
$t$, is 2 if $v$ is a root; $wt_t(v) = 1$ otherwise. For a peripheral vertex, $v$, define an $H$-path
starting at $v$ to be any path in the $H$-graph starting at $v$ and ending immediately before the
first core vertex. For a core vertex, $v$, an $H$-path starting at $v$ is defined to be a path in the
$H$-graph starting at $v$ and ending at the first vertex, $u$, for which $\bar{H}(u) = l$. The $H$-weight
of an $H$-path, $P$, is the sum of the weights of the vertices on the path. The $H$-depth of $v$
at round $t$, $H\text{-depth}_t(v)$, is the maximum over all $H$-paths starting at $v$ of the $H$-weight of
the path. For any component, $C$, define $H\text{-depth}_t(C)$ to be the maximum over all vertices,
$u \in C$, of $H\text{-depth}_t(u)$. We show that for any vertex, $v$, $O(\log |C|)$ rounds after all the edges
terminate, if $v$ is a peripheral vertex then $H(v)$ is a core vertex, and if $v$ is a core vertex,
$H(v)$ is the final leader, $l$, where $|C|$ is the number of vertices in the component containing $v$.
Lemma 12.1 The following holds for any vertex, $v$, and for any two steps, $T_1 \le t_1 \le t_2$.


1. If $v$ is a core vertex then $\bar{H}_{t_1}(v)$ and $H_{t_1}(v)$ are core vertices.

2. If $H_{t_1}(v)$ is a core vertex, then $H_{t_2}(v)$ is a core vertex.

3. If $v$ is a peripheral vertex then there is an $H_{t_1}$-path from $v$ to $H_{t_2}(v)$.

Proof. By Theorem 6.1, once all the edges terminate each component of the input graph
corresponds to one component of the $p^+$-graph, and therefore to one component of the
$q^+$-graph (by Lemma 7.14 (p. 24)). Also, once a leader is promoted during Stage 2 it remains
a root (for there are no edges to give hooking suggestions). From this it follows that the
only vertex in a $q^+$-component that can be promoted during Stage 2 is the final leader
of the component, which must be a core vertex. In addition, during Stage 2, a vertex $v$
replaces $\bar{H}(v)$ by $H(v)$ and $H(v)$ by $\bar{H}(H(v))$. The lemma follows by induction on the step
number. □

Lemma 12.2 For any vertex, $v$, and for any round, $r$, if $H\text{-depth}_r(v) \ge 3$ then
$H\text{-depth}_{r+1}(v)$ is smaller than $H\text{-depth}_r(v)$ by at least a constant factor.

Proof. For any two vertices, $x$ and $y$, if there is a path in the $H$-graph from $x$ to $y$, define
the distance from $x$ to $y$ to be the difference $H\text{-depth}(x) - H\text{-depth}(y)$.

Let $v_1, v_2, v_3$ be any three consecutive vertices on some $H_{r+1}$-path. Then there is a path
in the $H_r$-graph from $v_1$ to $v_3$. Consider the $H_{r+1}$-distance, $d$, from $v_1$ to $v_3$.

d = 2: Then $v_1$ is not a root following round $r + 1$; so during round $r + 1$, $v_1$ either executed
a doubling operation or a hooking operation. Therefore, the distance from $v_1$ to $v_3$ in
$H_r$ is at least 3.

d = 3: One of $v_1$ or $v_2$ is a root following round $r + 1$. If a vertex, $u$, is a root following
round $r + 1$, then during the round it must have advanced its next pointer (otherwise
it would have hooked). Therefore, the distance from $v_1$ to $v_3$ in $H_r$ is at least 4.

d = 4: Both $v_1$ and $v_2$ are roots following round $r + 1$. Both $v_1$ and $v_2$ advanced their next
pointers. Therefore, the distance from $v_1$ to $v_3$ in $H_r$ is at least 6.

Thus, taking into account the last vertex on the path, we conclude that if the heaviest
$H_{r+1}$-path has at least two edges then, even in the worst case, the depth of $v$ was reduced by
at least a constant factor. On the other hand, if the heaviest $H_{r+1}$-path has only one edge,
$H_{r+1}\text{-depth}(v) \le 2$. The lemma follows. □
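The three cases of the proof each bound the old distance from below, and the worst ratio among them is easy to compute. The sketch below covers only this per-triple bound; the constant in Lemma 12.2 also has to absorb the weight of the last vertex on the path, which this sketch does not model. The table name is illustrative.

```python
from fractions import Fraction

# H_{r+1}-distance d from v1 to v3  ->  minimum distance from v1 to v3 in H_r,
# as established in the three cases of the proof.
MIN_OLD_DISTANCE = {2: 3, 3: 4, 4: 6}

# Ratio new distance / old distance in each case; the largest one is the worst.
RATIOS = {d: Fraction(d, old) for d, old in MIN_OLD_DISTANCE.items()}
WORST_RATIO = max(RATIOS.values())

if __name__ == "__main__":
    print(RATIOS, WORST_RATIO)
```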
Theorem 12.1 Stage 2 completes after $O(\log n)$ rounds.

Proof. Consider any component, $C$, of the input graph. Immediately after step $T_1$, the
$H$-depth of $C$ is at most $2|C|$. For any peripheral vertex, $v$, if $v$ is a root and $H\text{-depth}(v) = 2$
or if $H\text{-depth}(v) = 1$ then $F(v)$ is a core vertex. It follows from Lemma 12.2 that for some
$k$, $k = O(\log |C|) + T_1$, for any peripheral vertex, $u$, in $C$, $F_k(F_k(u))$ is a core vertex ($F_k(u)$
may be a peripheral root). Also, for every core vertex, $u$, in $C$, $\bar{H}_k(u) = l$. In particular,
$\bar{H}_k(l) = l$ and $l$ is promoted by the end of round $k + 1$. From this it follows that $F_{k+1}(l) = l$,
and the process associated with $l$ terminates during round $k + 1$.

It follows that for some $k'$, $k' \le k + 4$, all the processes terminate. □

We  have  shown 

Theorem 12.2 (Connectivity) There is an APRAM algorithm which computes the connected
components of an undirected graph in $O(\log n)$ rounds using $n + e$ processes, where $n$
is the number of vertices of the graph and $e$ is the number of edges.